Posted to user@flink.apache.org by Stephan Ewen <se...@apache.org> on 2015/04/01 10:50:50 UTC

Re: Problem with Amazon S3

It looks like the S3 URL is in an unexpected format. The file system tries
to use the bucket name as the hostname. Can you tell us the S3 URL (without
user / password), so we can take a look?
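
To illustrate the diagnosis, here is a minimal sketch of how a URL of the form
s3://BUCKET/KEY is split by java.net.URI. It assumes the S3 client derives its
endpoint host from the URI's authority part, which is what the
UnknownHostException suggests; the object name S3UriHost and the key
some/key.txt are hypothetical and only serve the illustration.

```scala
import java.net.URI

object S3UriHost {
  // Returns the host component of the URL, i.e. the name that a client
  // using the URI's authority as its endpoint would try to resolve via DNS.
  def hostOf(url: String): String = new URI(url).getHost

  def main(args: Array[String]): Unit = {
    // The bare bucket name ends up as the host, which DNS cannot resolve.
    println(hostOf("s3://MYBUCKETNAME/some/key.txt"))
    // With an S3 endpoint appended, the host is a resolvable name.
    println(hostOf("s3://MYBUCKETNAME.s3.amazonaws.com/some/key.txt"))
  }
}
```

If that assumption holds, a URL without an endpoint leaves the bucket name
alone in the host position, which matches the exception in the quoted trace.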

Greetings,
Stephan


On Tue, Mar 31, 2015 at 12:52 PM, pietro <pi...@gmail.com> wrote:

> Thank you Ufuk! That helped a lot.
>
> But I have another problem now.
>
> Am I missing something?
>
> Caused by: java.net.UnknownHostException: MYBUCKETNAME
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1250)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1162)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1098)
>         at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
>         at org.apache.http.impl.conn.DefaultClientConnectionOperator.resolveHostname(DefaultClientConnectionOperator.java:278)
>         at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:162)
>         at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
>         at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:641)
>         at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>         at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:402)
>         ... 34 more
>
>         at org.apache.flink.runtime.fs.s3.S3FileSystem.initializeDirectoryStructure(S3FileSystem.java:248)
>         at org.apache.flink.runtime.fs.s3.S3FileSystem.initialize(S3FileSystem.java:222)
>         at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:258)
>         at org.apache.flink.core.fs.Path.getFileSystem(Path.java:310)
>         at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:402)
>         at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:51)
>         at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:145)
>         ... 23 more
>
>
>
> --
> View this message in context:
> http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Problem-with-Amazon-S3-tp946p949.html
> Sent from the Apache Flink (Incubator) User Mailing List archive. mailing
> list archive at Nabble.com.
>

Re: Problem with Amazon S3

Posted by Robert Metzger <rm...@apache.org>.
Flink also lets you use Hadoop's FileSystem interface [1].

Hadoop ships an S3 file system implementation by default, and I suspect
it's in better shape than Flink's implementation. It might make sense to
use Hadoop's S3 implementation through Flink's Hadoop FS support.

Please let me know if you run into any issues with this approach.

[1]
http://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html

On Fri, May 22, 2015 at 6:27 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> I am getting the same error as you are. Investigating now.
>

Re: Problem with Amazon S3

Posted by Aljoscha Krettek <al...@apache.org>.
I am getting the same error as you are. Investigating now.

On Wed, May 20, 2015 at 12:34 PM, pietro <pi...@gmail.com> wrote:
> Dears,
>
> I am still having problems retrieving data from S3. I followed all your
> instructions in the previous posts, but now I get this error:
>
> 15/05/20 10:47:05 INFO s3.S3FileSystem: Creating new S3 file system binding
> with Reduced Redundancy Storage enabled
> 15/05/20 10:47:13 WARN io.DelimitedInputFormat: Could not determine
> statistics for file 's3://genomic.s3-us-west-2.amazonaws.com/flink/ref/meta'
> due to an io error: Cannot establish connection to Amazon S3:
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we
> calculated does not match the signature you provided. Check your key and
> signing method. (Service: Amazon S3; Status Code: 403; Error Code:
> SignatureDoesNotMatch; Request ID: E719C84656C22D70), S3 Extended Request
> ID:
> 5yE3QhMxlrVuCiPe5lN/cVAWptceXRNuSUmIG9kwRtioimOX3znU4Fj3aY7+P1MTR4BTecyTvVM=
>
>
>
> I checked the keys in the flink-conf.yaml and they are correct.
>
> Any idea?
>
> Thanks
>
>
>

Re: Problem with Amazon S3

Posted by pietro <pi...@gmail.com>.
Dears,

I am still having problems retrieving data from S3. I followed all your
instructions in the previous posts, but now I get this error:

15/05/20 10:47:05 INFO s3.S3FileSystem: Creating new S3 file system binding
with Reduced Redundancy Storage enabled
15/05/20 10:47:13 WARN io.DelimitedInputFormat: Could not determine
statistics for file 's3://genomic.s3-us-west-2.amazonaws.com/flink/ref/meta'
due to an io error: Cannot establish connection to Amazon S3:
com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we
calculated does not match the signature you provided. Check your key and
signing method. (Service: Amazon S3; Status Code: 403; Error Code:
SignatureDoesNotMatch; Request ID: E719C84656C22D70), S3 Extended Request
ID:
5yE3QhMxlrVuCiPe5lN/cVAWptceXRNuSUmIG9kwRtioimOX3znU4Fj3aY7+P1MTR4BTecyTvVM=



I checked the keys in the flink-conf.yaml and they are correct.

Any idea?

Thanks




Re: Problem with Amazon S3

Posted by Ufuk Celebi <uc...@apache.org>.
Hey Pietro,

I've debugged this locally and I can get a connection to an S3 bucket with
the following format:

s3://<BUCKET NAME>.s3.amazonaws.com/<KEY>

Depending on the region of your S3 bucket, you have to use a different
endpoint (http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region).

So for example: s3://genomic.s3.amazonaws.com/flink/input/meta/1.txt
(assuming that "genomic" is your bucket in US Standard).
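
The format above can be sketched as a small helper that assembles the URL
from a bucket, an endpoint, and a key. This is only an illustration: the
object S3Url and its build method are hypothetical names, not part of Flink,
and the default endpoint is the US Standard one used in the example above.

```scala
object S3Url {
  // Builds s3://<BUCKET NAME>.<ENDPOINT>/<KEY>.
  // The endpoint depends on the bucket's region; the default here is the
  // US Standard endpoint from the example above.
  def build(bucket: String, key: String,
            endpoint: String = "s3.amazonaws.com"): String = {
    // Drop a leading slash so the result never contains "//" before the key.
    val cleanKey = key.stripPrefix("/")
    s"s3://$bucket.$endpoint/$cleanKey"
  }

  def main(args: Array[String]): Unit = {
    println(build("genomic", "flink/input/meta/1.txt"))
    // A bucket in another region needs that region's endpoint instead.
    println(build("genomic", "flink/ref/meta", "s3-us-west-2.amazonaws.com"))
  }
}
```

The resulting string can then be passed wherever the path is given today,
for example as the second argument of env.readFile.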

Does this work? Sorry for the inconvenience so far.



On Wed, Apr 1, 2015 at 10:58 AM, pietro <pi...@gmail.com> wrote:

> Hi Stephan,
>
> Sure, this is how I try to read the file in Flink:
>
> env.readFile(new DefaultReader(), "s3://genomic/flink/input/meta/1.txt").map(parser(_))
>
> I use the same format that I used in Pig.
>
> Thanks,
>
>
>
>
>

Re: Problem with Amazon S3

Posted by pietro <pi...@gmail.com>.
Hi Stephan,

Sure, this is how I try to read the file in Flink:

env.readFile(new DefaultReader(), "s3://genomic/flink/input/meta/1.txt").map(parser(_))

I use the same format that I used in Pig.

Thanks,



