Posted to user@spark.apache.org by Hitesh Goyal <hi...@nlpcaptcha.com> on 2016/09/28 05:28:17 UTC

Trying to fetch S3 data

Hi team,

I want to fetch data from an Amazon S3 bucket, and I am trying to access it using Scala.
I have already tried the basic word count application in Scala.
Now I want to use it to retrieve data from S3.
I have gone through the tutorials, but found only solutions for uploading files to S3.
Please tell me how I can retrieve the data stored in S3 buckets.

Regards,
Hitesh Goyal
Simpli5d Technologies
Cont No.: 9996588220


Re: Trying to fetch S3 data

Posted by Steve Loughran <st...@hortonworks.com>.
On 28 Sep 2016, at 06:28, Hitesh Goyal <hi...@nlpcaptcha.com> wrote:

> Hi team,
>
> I want to fetch data from an Amazon S3 bucket, and I am trying to access it using Scala.
> I have already tried the basic word count application in Scala.
> Now I want to use it to retrieve data from S3.
> I have gone through the tutorials, but found only solutions for uploading files to S3.
> Please tell me how I can retrieve the data stored in S3 buckets.


This is actually something I'm trying to document as part of some work to add a spark-cloud module, which adds the appropriate JARs to the classpath so that things work out of the box.

Could you have a look at

https://github.com/steveloughran/spark/blob/b04c037f2925d9b698e541493fc936627ddcf9ba/docs/cloud-integration.md

and tell me where it could be improved?
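
For readers landing on this thread, here is a minimal sketch of the kind of read being asked about: the word count example pointed at an S3 object through the Hadoop s3a connector. It is not taken from the linked document. It assumes the hadoop-aws JAR and a matching AWS SDK JAR are already on the classpath, that credentials are available in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and that the master is supplied by spark-submit. The bucket name, object key, and the s3aPath helper are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object S3WordCount {
  // Hypothetical helper: builds an s3a:// URI from a bucket name and an object key.
  def s3aPath(bucket: String, key: String): String =
    s"s3a://$bucket/${key.stripPrefix("/")}"

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("S3WordCount")
    val sc = new SparkContext(conf)

    // Assumption: credentials are read from environment variables.
    // fs.s3a.access.key and fs.s3a.secret.key are the s3a connector's property names.
    sc.hadoopConfiguration.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

    // Placeholder bucket and key; replace with your own.
    val lines = sc.textFile(s3aPath("my-example-bucket", "input/data.txt"))

    // Same shape as the basic word count: split, pair, and sum.
    val counts = lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Print a small sample of the results on the driver.
    counts.take(10).foreach(println)
    sc.stop()
  }
}
```

Reading from S3 uses the same sc.textFile call as reading a local or HDFS file; only the URI scheme and the connector configuration differ. Whether s3a is usable depends on the Hadoop version your Spark build ships with, so treat the property names and JAR requirements above as something to check against your own setup.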