You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by mce <gi...@git.apache.org> on 2015/04/09 13:32:21 UTC

[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

GitHub user mce opened a pull request:

    https://github.com/apache/spark/pull/5439

    SPARK-5960 allow aws credentials to be passed to KinesisUtils.createStream

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vngrs/spark SPARK-5960

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5439
    
----
commit 84eea143bb2319447cddaa9b2837993cf525880a
Author: Cihan Emre <ci...@vngrs.com>
Date:   2015-04-08T11:59:31Z

    SPARK-5960 allow aws credentials to be passed to KinesisUtils.createStream

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390845
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/examples/streaming/KinesisWordCountASL.scala ---
    @@ -113,7 +113,7 @@ private object KinesisWordCountASL extends Logging {
         /* Create the same number of Kinesis DStreams/Receivers as Kinesis stream's shards */
         val kinesisStreams = (0 until numStreams).map { i =>
           KinesisUtils.createStream(ssc, streamName, endpointUrl, kinesisCheckpointInterval,
    -          InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)
    +          InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2, None)
    --- End diff --
    
    would be nice to have a true default here versus adding to the signature


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-92027699
  
      [Test build #30115 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30115/consoleFull) for   PR 5439 at commit [`bb36da0`](https://github.com/apache/spark/commit/bb36da028c4fae5678df2f2901e227354b49a9a8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28054199
  
    --- Diff: extras/kinesis-asl/pom.xml ---
    @@ -48,6 +48,11 @@
           <scope>test</scope>
         </dependency>
         <dependency>
    +      <groupId>com.fasterxml.jackson.core</groupId>
    --- End diff --
    
    (Why is this needed?)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28408535
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -24,8 +24,7 @@ import org.apache.spark.storage.StorageLevel
     import org.apache.spark.streaming.Duration
     import org.apache.spark.streaming.receiver.Receiver
     
    -import com.amazonaws.auth.AWSCredentialsProvider
    -import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
    +import com.amazonaws.auth.{AWSCredentials, AWSCredentialsProvider, DefaultAWSCredentialsProviderChain}
     import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessor
    --- End diff --
    
    done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-96767499
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28409139
  
    --- Diff: docs/streaming-kinesis-integration.md ---
    @@ -32,7 +32,7 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
     		import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
     
     		val kinesisStream = KinesisUtils.createStream(
    -        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position])
    +        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position], [credentials])
    --- End diff --
    
    added more explanation, please see line 72 https://github.com/vngrs/spark/commit/5cfd4377f432d821584a28b856948ebde9696339#diff-7f6eb703c33969d0218557112e1992b3R72 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-91541414
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-92039483
  
      [Test build #30115 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30115/consoleFull) for   PR 5439 at commit [`bb36da0`](https://github.com/apache/spark/commit/bb36da028c4fae5678df2f2901e227354b49a9a8).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93348547
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30334/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28140170
  
    --- Diff: extras/kinesis-asl/pom.xml ---
    @@ -48,6 +48,11 @@
           <scope>test</scope>
         </dependency>
         <dependency>
    +      <groupId>com.fasterxml.jackson.core</groupId>
    --- End diff --
    
    Hm, I doubt this is the right way to fix it, since you are declaring a compile-time dependency on something that you don't use. Surely it's because there is a missing dep on some Kinesis component at runtime scope? you also shouldn't manage versions here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28140844
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/Credentials.scala ---
    @@ -0,0 +1,24 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.streaming.kinesis
    +
    +import com.amazonaws.auth.AWSCredentials
    +
    +case class Credentials(accessKeyId: String, secretKey: String) extends AWSCredentials {
    --- End diff --
    
    Hm, I wonder if that actually solves the problem since it still extends the AWS class. If the point is to pass access / secret key, that can be done by properties already right? so the point of this must be to enable other credential types? in which case it may be unavoidable to use the AWS class, and maybe fine given the nature of this module. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28408842
  
    --- Diff: extras/kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisReceiverSuite.scala ---
    @@ -86,7 +86,7 @@ class KinesisReceiverSuite extends TestSuiteBase with Matchers with BeforeAndAft
         // Tests the API, does not actually test data receiving
         val kinesisStream = KinesisUtils.createStream(ssc, "mySparkStream",
           "https://kinesis.us-west-2.amazonaws.com", Seconds(2),
    -      InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2);
    +      InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2, None)
    --- End diff --
    
    added tests for both cases 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93348502
  
      [Test build #30334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30334/consoleFull) for   PR 5439 at commit [`5cfd437`](https://github.com/apache/spark/commit/5cfd4377f432d821584a28b856948ebde9696339).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390830
  
    --- Diff: extras/kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisReceiverSuite.scala ---
    @@ -86,7 +86,7 @@ class KinesisReceiverSuite extends TestSuiteBase with Matchers with BeforeAndAft
         // Tests the API, does not actually test data receiving
         val kinesisStream = KinesisUtils.createStream(ssc, "mySparkStream",
           "https://kinesis.us-west-2.amazonaws.com", Seconds(2),
    -      InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2);
    +      InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2, None)
    --- End diff --
    
    would be good to have tests around setting the credentials in the following 2 cases:
    1) None (null)
    2) Some[AWSCredentials]


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93668194
  
      [Test build #30406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30406/consoleFull) for   PR 5439 at commit [`7789df7`](https://github.com/apache/spark/commit/7789df78ab7d55b22c91d3faca63942de0b44a62).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28140876
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -90,8 +93,9 @@ object KinesisUtils {
           endpointUrl: String, 
           checkpointInterval: Duration,
           initialPositionInStream: InitialPositionInStream,
    -      storageLevel: StorageLevel): JavaReceiverInputDStream[Array[Byte]] = {
    +      storageLevel: StorageLevel,
    --- End diff --
    
    This changes the Java API right? you have to pass credentials now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390844
  
    --- Diff: extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java ---
    @@ -131,7 +131,7 @@ public static void main(String[] args) {
             for (int i = 0; i < numStreams; i++) {
               streamsList.add(
                 KinesisUtils.createStream(jssc, streamName, endpointUrl, checkpointInterval, 
    -            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2())
    +            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2(), null)
    --- End diff --
    
    would be nice to have a true default here versus adding to the signature


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28408803
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -59,9 +61,10 @@ object KinesisUtils {
           endpointUrl: String,
           checkpointInterval: Duration,
           initialPositionInStream: InitialPositionInStream,
    -      storageLevel: StorageLevel): ReceiverInputDStream[Array[Byte]] = {
    +      storageLevel: StorageLevel,
    +      credentials: Option[AWSCredentials]): ReceiverInputDStream[Array[Byte]] = {
    --- End diff --
    
    updated the parameter with default value ```None``` which means ```DefaultAWSCredentialsProvider``` will be used in the end.
    
    However i couldn't to do the same for the Java api. Any suggestions?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-92042569
  
    @srowen tests passed, fyi


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390315
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -82,15 +82,19 @@ private[kinesis] class KinesisReceiver(
       var workerId: String = null
     
       /*
    -   * This impl uses the DefaultAWSCredentialsProviderChain and searches for credentials 
    -   *   in the following order of precedence:
    +   * This impl uses the DefaultAWSCredentialsProviderChain unless it's provided by constructor
    +   *  and searches for credentials in the following order of precedence:
    --- End diff --
    
    might be good to reword this to be more explicit about the 2 different scenarios:
    1) Some[AWSCredentials] are provided in which case we use those
    2) None are provided and therefore DefaultAWSCredentialsProviderChain will be constructed which searches for credentials in the following order...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28054428
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -16,6 +16,7 @@
      */
     package org.apache.spark.streaming.kinesis
     
    +import com.amazonaws.auth.AWSCredentials
    --- End diff --
    
    Nit: space after import. 
    
    I know we have all lived to regret including third-party classes as part of the API. Although it seems maybe essential here, I wonder if there is any reasonable way to not accept this as `AWSCredentials` but as access keys or something. I suppose that this could work but would assume a certain type of credentials are being used.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28145657
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -90,8 +93,9 @@ object KinesisUtils {
           endpointUrl: String, 
           checkpointInterval: Duration,
           initialPositionInStream: InitialPositionInStream,
    -      storageLevel: StorageLevel): JavaReceiverInputDStream[Array[Byte]] = {
    +      storageLevel: StorageLevel,
    --- End diff --
    
    yes


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390951
  
    --- Diff: docs/streaming-kinesis-integration.md ---
    @@ -44,7 +44,7 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
     		import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream;
     
     		JavaReceiverInputDStream<byte[]> kinesisStream = KinesisUtils.createStream(
    -        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position]);
    +        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position], [credentials]);
    --- End diff --
    
    same here :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93686710
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30406/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28125817
  
    --- Diff: extras/kinesis-asl/pom.xml ---
    @@ -48,6 +48,11 @@
           <scope>test</scope>
         </dependency>
         <dependency>
    +      <groupId>com.fasterxml.jackson.core</groupId>
    --- End diff --
    
    tests are failing on kinesis-acl, at least on my local. Error message is saying a method is missing from a class in jackson library. I don't know the exact reason but adding this dependency fixes the error


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-91203788
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-97622154
  
    I am taking a look at this PR a little late. But I am not sure how this work at all. AWSCredentials is not serializable. 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93686694
  
      [Test build #30406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30406/consoleFull) for   PR 5439 at commit [`7789df7`](https://github.com/apache/spark/commit/7789df78ab7d55b22c91d3faca63942de0b44a62).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390948
  
    --- Diff: docs/streaming-kinesis-integration.md ---
    @@ -32,7 +32,7 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
     		import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
     
     		val kinesisStream = KinesisUtils.createStream(
    -        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position])
    +        	streamingContext, [Kinesis stream name], [endpoint URL], [checkpoint interval], [initial position], [credentials])
    --- End diff --
    
    might be good to include some more details on what this is, exactly.
    
    this is a bit confusing if you're thinking of this in terms of ACCESS_KEY_ID and SECRET_KEY (2 separate strings)
    
    also, be sure to include any imports that are relevant to this code sample - similar to the InitialPositionInStream reference right before it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28054255
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -70,6 +69,20 @@ private[kinesis] class KinesisReceiver(
         storageLevel: StorageLevel)
       extends Receiver[Array[Byte]](storageLevel) with Logging { receiver =>
     
    +  def this(appName: String,
    +           streamName: String,
    +           endpointUrl: String,
    +           checkpointInterval: Duration,
    +           initialPositionInStream: InitialPositionInStream,
    +           storageLevel: StorageLevel,
    +           credentials: AWSCredentials) = {
    --- End diff --
    
    Since this is internal, can this just be implemented with an optional final parameter to the existing declared constructor?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93354224
  
      [Test build #30335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30335/consoleFull) for   PR 5439 at commit [`9877a98`](https://github.com/apache/spark/commit/9877a980e79d37a5cc2d999c0dceedcdeacfca30).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28408687
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -82,15 +82,19 @@ private[kinesis] class KinesisReceiver(
       var workerId: String = null
     
       /*
    -   * This impl uses the DefaultAWSCredentialsProviderChain and searches for credentials 
    -   *   in the following order of precedence:
    +   * This impl uses the DefaultAWSCredentialsProviderChain unless it's provided by constructor
    +   *  and searches for credentials in the following order of precedence:
        * Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
        * Java System Properties - aws.accessKeyId and aws.secretKey
        * Credential profiles file at the default location (~/.aws/credentials) shared by all 
        *   AWS SDKs and the AWS CLI
        * Instance profile credentials delivered through the Amazon EC2 metadata service
        */
    -  var credentialsProvider: AWSCredentialsProvider = null
    +  var credentialsProvider: AWSCredentialsProvider = (credentials map { cr => new AWSCredentialsProvider {
    --- End diff --
    
    i don't know if there's a simpler way but i'll think about it.
    
    although the unit tests are passing now, i'll run example applications to check if the serialization issue you mentioned still exists 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28202950
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -90,8 +93,9 @@ object KinesisUtils {
           endpointUrl: String, 
           checkpointInterval: Duration,
           initialPositionInStream: InitialPositionInStream,
    -      storageLevel: StorageLevel): JavaReceiverInputDStream[Array[Byte]] = {
    +      storageLevel: StorageLevel,
    --- End diff --
    
    Yes, the idea is not to change the API. I suppose this is less important as a module, but still. I would have thought MiMa would flag this but I don't know that it checks this module.
    
    I think this is really something for @cfregly et al to review


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-91575748
  
      [Test build #30028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30028/consoleFull) for   PR 5439 at commit [`d93dcb6`](https://github.com/apache/spark/commit/d93dcb6a77e0660f3391f54c216f5a241b77a508).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Credentials(accessKeyId: String, secretKey: String) extends AWSCredentials `
    
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28125877
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -16,6 +16,7 @@
      */
     package org.apache.spark.streaming.kinesis
     
    +import com.amazonaws.auth.AWSCredentials
    --- End diff --
    
    Updated code a bit after your comments. 
    
    Created ```Credentials``` class to avoid third-party import in api.
    Moved credentials parameter to existing constructor as an optional parameter.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-94652390
  
    @cfregly any other comments?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-91543091
  
      [Test build #30028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30028/consoleFull) for   PR 5439 at commit [`d93dcb6`](https://github.com/apache/spark/commit/d93dcb6a77e0660f3391f54c216f5a241b77a508).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390431
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -82,15 +82,19 @@ private[kinesis] class KinesisReceiver(
       var workerId: String = null
     
       /*
    -   * This impl uses the DefaultAWSCredentialsProviderChain and searches for credentials 
    -   *   in the following order of precedence:
    +   * This impl uses the DefaultAWSCredentialsProviderChain unless it's provided by constructor
    +   *  and searches for credentials in the following order of precedence:
        * Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
        * Java System Properties - aws.accessKeyId and aws.secretKey
        * Credential profiles file at the default location (~/.aws/credentials) shared by all 
        *   AWS SDKs and the AWS CLI
        * Instance profile credentials delivered through the Amazon EC2 metadata service
        */
    -  var credentialsProvider: AWSCredentialsProvider = null
    +  var credentialsProvider: AWSCredentialsProvider = (credentials map { cr => new AWSCredentialsProvider {
    --- End diff --
    
    i seem to remember having problems when constructing the AWSCredentialsProvider outside of the onStart() method due to this hierarchy (including DefaultAWSCredentialsProviderChain) not being Serializable.
    
    i assume this works, but wondering what may have changed.
    
    also, is there a simpler way to do this in Scala?  i feel like this could be simplified, but maybe i'm missing something.  seems like a lot of syntax for something so simple.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390832
  
    --- Diff: extras/kinesis-asl/src/test/java/org/apache/spark/streaming/kinesis/JavaKinesisStreamSuite.java ---
    @@ -34,7 +34,7 @@ public void testKinesisStream() {
         // Tests the API, does not actually test data receiving
         JavaDStream<byte[]> kinesisStream = KinesisUtils.createStream(ssc, "mySparkStream",
             "https://kinesis.us-west-2.amazonaws.com", new Duration(2000), 
    -        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
    +        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2(), null);
    --- End diff --
    
    would be good to have tests around setting the credentials in the following 2 cases:
    1) None (null)
    2) Some[AWSCredentials]


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390980
  
    --- Diff: docs/streaming-kinesis-integration.md ---
    @@ -67,6 +67,8 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
     
     	- `[initial position]`: Can be either `InitialPositionInStream.TRIM_HORIZON` or `InitialPositionInStream.LATEST` (see Kinesis Checkpointing section and Amazon Kinesis API documentation for more details).
     
    +	- `[credentials]`: Optional AWS credentials.
    --- End diff --
    
    it's a bit misleading to say this is optional since we still need to provide a value (None/null or Some)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93398033
  
      [Test build #30335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30335/consoleFull) for   PR 5439 at commit [`9877a98`](https://github.com/apache/spark/commit/9877a980e79d37a5cc2d999c0dceedcdeacfca30).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r29395910
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -68,7 +64,8 @@ private[kinesis] class KinesisReceiver(
         endpointUrl: String,
         checkpointInterval: Duration,
         initialPositionInStream: InitialPositionInStream,
    -    storageLevel: StorageLevel)
    +    storageLevel: StorageLevel,
    +    credentials: Option[AWSCredentials])
    --- End diff --
    
    I am really not sure how this will work. AWSCredentials is not serializable, while the Receiver has to be serializable (as the receiver object is sent to the executors from the driver). Did you test whether this works?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-91575769
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30028/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93687457
  
    @cfregly Moved initialization code into ```onStart``` method to fix serialization error. I tested by running example application and it runs without any problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93306684
  
      [Test build #30334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30334/consoleFull) for   PR 5439 at commit [`5cfd437`](https://github.com/apache/spark/commit/5cfd4377f432d821584a28b856948ebde9696339).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390679
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -16,6 +16,7 @@
      */
     package org.apache.spark.streaming.kinesis
     
    +import com.amazonaws.auth.AWSCredentials
    --- End diff --
    
    my initial attempt at adding this support introduced a BasicAWSCredentialsProvider as follows:
    
    https://github.com/cfregly/spark/blob/kinesis-dbc/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/BasicAWSCredentialsProvider.scala
    
    highlights:
    1) takes AccessKey and SecretKey as Strings
    2) follows the same hierarchy as the rest of the AWSCredentialsProviders (including not being Serializable)
    
    
    
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390150
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -24,8 +24,7 @@ import org.apache.spark.storage.StorageLevel
     import org.apache.spark.streaming.Duration
     import org.apache.spark.streaming.receiver.Receiver
     
    -import com.amazonaws.auth.AWSCredentialsProvider
    -import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
    +import com.amazonaws.auth.{AWSCredentials, AWSCredentialsProvider, DefaultAWSCredentialsProviderChain}
     import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessor
    --- End diff --
    
    collapse the IRecordProcessor and IRecordProcessorFactory into a single import, as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28391006
  
    --- Diff: extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java ---
    @@ -131,7 +131,7 @@ public static void main(String[] args) {
             for (int i = 0; i < numStreams; i++) {
               streamsList.add(
                 KinesisUtils.createStream(jssc, streamName, endpointUrl, checkpointInterval, 
    -            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2())
    +            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2(), null)
    --- End diff --
    
    also, might want to name this variable to be more explicit.  you can set it to None/null, but naming params goes a long way in these examples.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-93398087
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30335/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28408838
  
    --- Diff: extras/kinesis-asl/src/test/java/org/apache/spark/streaming/kinesis/JavaKinesisStreamSuite.java ---
    @@ -34,7 +34,7 @@ public void testKinesisStream() {
         // Tests the API, does not actually test data receiving
         JavaDStream<byte[]> kinesisStream = KinesisUtils.createStream(ssc, "mySparkStream",
             "https://kinesis.us-west-2.amazonaws.com", new Duration(2000), 
    -        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
    +        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2(), null);
    --- End diff --
    
    added tests for both cases 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28054339
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -119,7 +132,9 @@ private[kinesis] class KinesisReceiver(
        */
       override def onStart() {
         workerId = InetAddress.getLocalHost.getHostAddress() + ":" + UUID.randomUUID()
    -    credentialsProvider = new DefaultAWSCredentialsProviderChain()
    +    if(credentialsProvider == null) {
    +      credentialsProvider = new DefaultAWSCredentialsProviderChain()
    --- End diff --
    
    Can this simply be the default value of the field rather than delay its init? or is this expensive?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by mce <gi...@git.apache.org>.
Github user mce commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28419555
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -82,15 +82,19 @@ private[kinesis] class KinesisReceiver(
       var workerId: String = null
     
       /*
    -   * This impl uses the DefaultAWSCredentialsProviderChain and searches for credentials 
    -   *   in the following order of precedence:
    +   * This impl uses the DefaultAWSCredentialsProviderChain unless it's provided by constructor
    +   *  and searches for credentials in the following order of precedence:
        * Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
        * Java System Properties - aws.accessKeyId and aws.secretKey
        * Credential profiles file at the default location (~/.aws/credentials) shared by all 
        *   AWS SDKs and the AWS CLI
        * Instance profile credentials delivered through the Amazon EC2 metadata service
        */
    -  var credentialsProvider: AWSCredentialsProvider = null
    +  var credentialsProvider: AWSCredentialsProvider = (credentials map { cr => new AWSCredentialsProvider {
    --- End diff --
    
    I just tested the new code by running the example applications and it seems the serialization problem still happening. I'll move the initialization into onStart method and run the tests again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28390749
  
    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -59,9 +61,10 @@ object KinesisUtils {
           endpointUrl: String,
           checkpointInterval: Duration,
           initialPositionInStream: InitialPositionInStream,
    -      storageLevel: StorageLevel): ReceiverInputDStream[Array[Byte]] = {
    +      storageLevel: StorageLevel,
    +      credentials: Option[AWSCredentials]): ReceiverInputDStream[Array[Byte]] = {
    --- End diff --
    
    any thought to using a default param where new DefaultAWSCredentialsProvider() is the default to reduce the API changes?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by cfregly <gi...@git.apache.org>.
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5439#discussion_r28391083
  
    --- Diff: extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java ---
    @@ -131,7 +131,7 @@ public static void main(String[] args) {
             for (int i = 0; i < numStreams; i++) {
               streamsList.add(
                 KinesisUtils.createStream(jssc, streamName, endpointUrl, checkpointInterval, 
    -            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2())
    +            InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2(), null)
    --- End diff --
    
    another option is to require a non-null value to tighten up the interface - DefaultAWSCredentialsProvider() would be recommended, but people can pass in their own.
    
    again, please make sure there are no Serializable issues if you go this route.  i remember having issues here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-5960 allow aws credentials to be passed ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5439#issuecomment-92039536
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30115/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org