Posted to user@spark.apache.org by robin_up <ro...@gmail.com> on 2014/01/28 04:59:56 UTC

SparkStreaming not read hadoop configuration from its sparkContext on Stand Alone mode?

Hi 

I'm trying to run a small piece of code on Spark Streaming. It sets the S3 keys
on a SparkContext object, which is then passed into a StreamingContext. However,
I get the error below -- it seems the StreamingContext does not use the Hadoop
configuration on the worker threads. The same code works fine in Spark core
(batch mode) without streaming.

java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key
must be specified as the username or password (respectively) of a s3n URL,
or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey
properties (respectively). 


//my code:

import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

System.setProperty("spark.cleaner.ttl", "3600")

val spark_master = "spark://" + System.getenv("SPARK_MASTER_IP") +
  ":" + System.getenv("SPARK_MASTER_PORT")
val external_jars = Seq(
  "target/scala-2.9.3/test_2.9.3-1.0.jar",
  "/opt/json4s-core_2.9.3-3.2.2.jar",
  "/opt/json4s-native_2.9.3-3.2.2.jar",
  "/opt/json4s-ast_2.9.3-3.2.2.jar")

val sc = new SparkContext(spark_master, "test",
  System.getenv("SPARK_HOME"), external_jars)

// Set the S3 credentials on the driver's Hadoop configuration.
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId",
  System.getenv("ds_awsAccessKeyId"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey",
  System.getenv("ds_awsSecretAccessKey"))

val ssc = new StreamingContext(sc, Seconds(5))

val file = ssc.textFileStream("s3n://my-bucket/syslog-ng/2014-01-24/")
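(One workaround the error message itself suggests is to embed the credentials in the s3n URL as user-info, so each task resolves them from the URL rather than from the driver-side hadoopConfiguration. A minimal sketch, using a hypothetical helper name `s3nUrlWithKeys`; the secret must be URL-encoded, since AWS secret keys can contain '/' characters that would break the URL.)

```scala
import java.net.URLEncoder

// Hypothetical helper (not in the original post): build an s3n URL
// with the access key and secret embedded as user-info. The secret is
// URL-encoded because AWS secrets may contain '/' or '+'.
def s3nUrlWithKeys(accessKey: String, secretKey: String,
                   bucket: String, path: String): String = {
  val encodedSecret = URLEncoder.encode(secretKey, "UTF-8")
  "s3n://" + accessKey + ":" + encodedSecret + "@" + bucket + "/" + path
}

// Usage sketch:
// val file = ssc.textFileStream(
//   s3nUrlWithKeys(System.getenv("ds_awsAccessKeyId"),
//                  System.getenv("ds_awsSecretAccessKey"),
//                  "my-bucket", "syslog-ng/2014-01-24/"))
```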



-----
-- Robin Li
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkStreaming-not-read-hadoop-configuration-from-its-sparkContext-on-Stand-Alone-mode-tp972.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: SparkStreaming not read hadoop configuration from its sparkContext on Stand Alone mode?

Posted by Tathagata Das <ta...@gmail.com>.
Which version of Spark are you trying this with?
