Posted to user@spark.apache.org by SK <sk...@gmail.com> on 2015/08/06 19:27:11 UTC

Specifying the role when launching an AWS spark cluster using spark_ec2

Hi,

I need to access data on S3 from another account, and I have been given the
IAM role information to access that S3 bucket. From what I understand, AWS
allows us to attach a role to a resource at the time it is created. However,
I don't see an option for specifying the role using the spark_ec2.py script.
So I created a Spark cluster using the default role, but I was not able to
change its IAM role after creation through the AWS console.
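
Just to illustrate what I mean by attaching the role at creation time: outside
of spark_ec2.py it would look roughly like this with boto3 (a sketch only; the
AMI id, instance type and profile name below are placeholders):

    # Sketch only: attaching an instance profile when the instance is created.
    # The AMI id, instance type and profile name are placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    ec2.run_instances(
        ImageId="ami-xxxxxxxx",
        InstanceType="m3.large",
        MinCount=1,
        MaxCount=1,
        IamInstanceProfile={"Name": "cross-account-s3-role"},
    )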

I see a pull request for this issue,
https://github.com/apache/spark/pull/6962, but its status is closed.

If anyone knows how I can specify the role using spark_ec2.py, please let me
know. I am using Spark 1.4.1.

thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Specifying-the-role-when-launching-an-AWS-spark-cluster-using-spark-ec2-tp24154.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Specifying the role when launching an AWS spark cluster using spark_ec2

Posted by Steve Loughran <st...@hortonworks.com>.
There's no support for IAM roles in the s3n:// client code in Apache Hadoop (HADOOP-9384); Amazon's modified EMR distro may have it.

The s3a filesystem does add it, and it is ready for production use in Hadoop 2.7.1+ (implicitly HDP 2.3; CDH 5.4 has cherry-picked the relevant patches). I don't know about the spark_ec2 scripts, or which Hadoop version they start up with.
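
If the EC2 nodes were launched with an instance profile that can read the bucket, then on Hadoop 2.7.1+ something like this should work with no keys in the configuration at all (rough PySpark sketch; the bucket and prefix are placeholders):

    # Rough sketch, assuming Hadoop 2.7.1+ with the hadoop-aws/AWS SDK jars on
    # the classpath and an instance profile on the nodes that can read the bucket.
    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("s3a-iam-read"))

    # No fs.s3a.access.key / fs.s3a.secret.key set anywhere: the s3a
    # credential chain falls back to the EC2 instance profile credentials.
    lines = sc.textFile("s3a://other-account-bucket/some/prefix/*")
    print(lines.count())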

> On 6 Aug 2015, at 10:27, SK <sk...@gmail.com> wrote:
> 
> Hi,
> 
> I need to access data on S3 from another account, and I have been given the
> IAM role information to access that S3 bucket. From what I understand, AWS
> allows us to attach a role to a resource at the time it is created. However,
> I don't see an option for specifying the role using the spark_ec2.py script.
> So I created a Spark cluster using the default role, but I was not able to
> change its IAM role after creation through the AWS console.
> 
> I see a pull request for this issue,
> https://github.com/apache/spark/pull/6962, but its status is closed.
> 
> If anyone knows how I can specify the role using spark_ec2.py, please let me
> know. I am using Spark 1.4.1.
> 
> thanks
> 
> 
> 
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Specifying-the-role-when-launching-an-AWS-spark-cluster-using-spark-ec2-tp24154.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


RE: Specifying the role when launching an AWS spark cluster using spark_ec2

Posted by Ewan Leith <ew...@realitymine.com>.
You'll have a lot less hassle using the AWS EMR instances with Spark 1.4.1 for now, at least until the spark_ec2.py scripts move to Hadoop 2.7.1; at the moment I'm pretty sure they only use Hadoop 2.4.

The EMR setup with Spark lets you use s3:// URIs with IAM roles.
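
Something like this brings up an EMR cluster with Spark and an instance profile on the nodes (rough boto3 sketch; the names, sizes, release label and roles are placeholders, and you'd point the job flow role at whatever profile has the cross-account S3 permissions):

    # Rough sketch only (boto3); names, sizes, release label and roles are
    # placeholders.
    import boto3

    emr = boto3.client("emr", region_name="us-east-1")
    emr.run_job_flow(
        Name="spark-cluster",
        ReleaseLabel="emr-4.0.0",   # pick the release carrying the Spark you want
        Applications=[{"Name": "Spark"}],
        Instances={
            "MasterInstanceType": "m3.xlarge",
            "SlaveInstanceType": "m3.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
            "Ec2KeyName": "my-keypair",
        },
        # JobFlowRole is the EC2 instance profile the nodes run under;
        # use the profile/role that grants the S3 access you need.
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
        VisibleToAllUsers=True,
    )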

Ewan

-----Original Message-----
From: SK [mailto:skrishna.id@gmail.com] 
Sent: 06 August 2015 18:27
To: user@spark.apache.org
Subject: Specifying the role when launching an AWS spark cluster using spark_ec2

Hi,

I need to access data on S3 from another account, and I have been given the IAM role information to access that S3 bucket. From what I understand, AWS allows us to attach a role to a resource at the time it is created. However, I don't see an option for specifying the role using the spark_ec2.py script.
So I created a Spark cluster using the default role, but I was not able to change its IAM role after creation through the AWS console.

I see a pull request for this issue,
https://github.com/apache/spark/pull/6962, but its status is closed.

If anyone knows how I can specify the role using spark_ec2.py, please let me know. I am using Spark 1.4.1.

thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Specifying-the-role-when-launching-an-AWS-spark-cluster-using-spark-ec2-tp24154.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org