Posted to user@flink.apache.org by vinay patil <vi...@gmail.com> on 2017/03/06 14:51:32 UTC

Re: Integrate Flink with S3 on EMR cluster

Hi Guys,

I am getting the same exception:
EMRFileSystem not Found

I am trying to read an encrypted S3 file using Hadoop File System class
(using Flink 1.2.0).
When I copy all the libs from /usr/share/aws/emrfs/lib and /usr/lib/hadoop
to the Flink lib folder, it works.

However, I see that all these libs are already included in the Hadoop
classpath.

Is there any other way I can make this work?
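
For readers who want to check the point about the Hadoop classpath, here is a minimal shell sketch that tests whether a colon-separated classpath mentions a given directory. The helper name classpath_has is made up for illustration and is not part of Hadoop or Flink; only the EMRFS path comes from the message above.

```shell
# Hypothetical helper (not part of Hadoop or Flink): succeeds if any entry of a
# colon-separated classpath contains the given fixed substring.
classpath_has() {
  # $1: classpath string, $2: substring to look for
  printf '%s\n' "$1" | tr ':' '\n' | grep -qF -- "$2"
}
```

On a node where the hadoop CLI is installed (an assumption), something like `classpath_has "$(hadoop classpath)" /usr/share/aws/emrfs/lib && echo "EMRFS libs are listed"` would report whether the EMRFS directory is listed. Being listed there does not necessarily mean that Flink's own classpath picks it up, which would be consistent with the exception above.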



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Integrate-Flink-with-S3-on-EMR-cluster-tp5894p12053.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: Integrate Flink with S3 on EMR cluster

Posted by Stephan Ewen <se...@apache.org>.
@vinay patil - Can you see if the same problem occurs if you use Flink 1.1
- to see if this is a regression in Flink 1.2?



On Tue, Mar 7, 2017 at 6:43 PM, Shannon Carey <sc...@expedia.com> wrote:

> Generally, using S3 filesystem in EMR with Flink has worked pretty well
> for me in Flink < 1.2 (unless you run out of connections in your HTTP
> pool). When you say, "using Hadoop File System class", what do you mean?
> In my experience, it's sufficient to just use the "s3://" filesystem
> protocol and Flink's Hadoop integration (plus S3 filesystem classes
> provided by EMR) will do the right thing.
>
> -Shannon
>

Re: Integrate Flink with S3 on EMR cluster

Posted by Robert Metzger <rm...@apache.org>.
Hi Vinay,

Using the HADOOP_CLASSPATH variable on the client machine is the
recommended way to solve this problem.

I'll update the documentation accordingly.


On Wed, Mar 8, 2017 at 10:26 AM, vinay patil <vi...@gmail.com>
wrote:

> Hi ,
>
> @Shannon - I am not facing any issue while writing to S3; I was getting
> NoClassDef errors when reading the file from S3.
>
> By "Hadoop File System" I mean that I am using the FileSystem class of
> Hadoop to read the file from S3.
>
> @Stephan - I tried with 1.1.4 and was getting the same issue.
>
> The easiest way I found is to run the "hadoop classpath" command and
> export its output as the HADOOP_CLASSPATH variable.
>
> This way we don't have to copy any S3-specific jars to the Flink lib folder.
>
>
>

Re: Integrate Flink with S3 on EMR cluster

Posted by vinay patil <vi...@gmail.com>.
Hi ,

@Shannon - I am not facing any issue while writing to S3; I was getting
NoClassDef errors when reading the file from S3.

By "Hadoop File System" I mean that I am using the FileSystem class of
Hadoop to read the file from S3.

@Stephan - I tried with 1.1.4 and was getting the same issue.

The easiest way I found is to run the "hadoop classpath" command and
export its output as the HADOOP_CLASSPATH variable.

This way we don't have to copy any S3-specific jars to the Flink lib folder.
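
A minimal sketch of that step, assuming the hadoop CLI is on the PATH of the machine where the Flink client is started. The function name set_hadoop_classpath is hypothetical; only the "hadoop classpath" command and the HADOOP_CLASSPATH variable come from this thread.

```shell
# Hypothetical helper: export HADOOP_CLASSPATH from the output of
# `hadoop classpath` instead of pasting the value by hand.
set_hadoop_classpath() {
  # Fail early with a message if the Hadoop CLI cannot be found.
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop CLI not found on PATH" >&2
    return 1
  fi
  HADOOP_CLASSPATH="$(hadoop classpath)"
  export HADOOP_CLASSPATH
}
```

Calling set_hadoop_classpath in the same shell session before starting the Flink client should have the same effect as pasting the value manually, and it keeps the value current if the Hadoop installation changes.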




Re: Integrate Flink with S3 on EMR cluster

Posted by Shannon Carey <sc...@expedia.com>.
Generally, using S3 filesystem in EMR with Flink has worked pretty well for me in Flink < 1.2 (unless you run out of connections in your HTTP pool). When you say, "using Hadoop File System class", what do you mean? In my experience, it's sufficient to just use the "s3://" filesystem protocol and Flink's Hadoop integration (plus S3 filesystem classes provided by EMR) will do the right thing.

-Shannon

Re: Integrate Flink with S3 on EMR cluster

Posted by vinay patil <vi...@gmail.com>.
Hi Guys,

Has anyone got this error before? If yes, have you found any solution
other than copying the jar files to the Flink lib folder?

Regards,
Vinay Patil

On Mon, Mar 6, 2017 at 8:21 PM, vinay patil [via Apache Flink User Mailing
List archive.] <ml...@n4.nabble.com> wrote:

> Hi Guys,
>
> I am getting the same exception:
> EMRFileSystem not Found
>
> I am trying to read an encrypted S3 file using Hadoop File System class
> (using Flink 1.2.0).
> When I copy all the libs from /usr/share/aws/emrfs/lib and /usr/lib/hadoop
> to the Flink lib folder, it works.
>
> However, I see that all these libs are already included in the Hadoop
> classpath.
>
> Is there any other way I can make this work?
>



