Posted to user@flink.apache.org by ant burton <ap...@gmail.com> on 2017/08/12 00:38:23 UTC

No file system found with scheme s3

Hello,

After following the instructions to set the S3 filesystem in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem), I encountered the following error:

No file system found with scheme s3, referenced in file URI 's3://<bucket>/<endpoint>'.

The documentation goes on to say: “If your job submission fails with an Exception message noting that No file system found with scheme s3 this means that no FileSystem has been configured for S3. Please check out the FileSystem Configuration section (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem) for details on how to configure this properly.”

After checking over the configuration, the error persisted. My configuration is as follows.

I am using the Docker image flink:1.3.1, run with the command: local
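
For completeness, the container was started along these lines (the container name and published port here are illustrative rather than copied from my setup; 8081 is the Flink web UI, and as I understand it the "local" command starts a single-node Flink inside the container):

# docker run -d --name flink -p 8081:8081 flink:1.3.1 local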

# flink --version
Version: 1.3.1, Commit ID: 1ca6e5b

# cat flink/config/flink-conf.yaml | head -n1
fs.hdfs.hadoopconf: /root/hadoop-config

The rest of the content of flink-conf.yaml is identical to the release version.

The following was added to /root/hadoop-config/core-site.xml. I understand this is used internally by Flink as the configuration for “org.apache.hadoop.fs.s3a.S3AFileSystem”.

I’ve removed my AWS access key and secret for obvious reasons; they are present in the actual file ;-)

# cat  /root/hadoop-config/core-site.xml
<configuration>
    <property>
        <name>fs.s3.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>

    <property>
        <name>fs.s3a.buffer.dir</name>
        <value>/tmp</value>
    </property>

    <property>
        <name>fs.s3a.access.key</name>
        <value>MY_ACCESS_KEY</value>
    </property>

    <property>
        <name>fs.s3a.secret.key</name>
        <value>MY_SECRET_KEY</value>
    </property>
</configuration>

The JARs aws-java-sdk-1.7.4.jar, hadoop-aws-2.7.4.jar, httpclient-4.2.5.jar and httpcore-4.2.5.jar were added to flink/lib/ from http://apache.mirror.anlx.net/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
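
For reference, this is roughly how the JARs were copied in; the share/hadoop/tools/lib path is from memory, so treat these commands as a sketch rather than an exact transcript:

# wget http://apache.mirror.anlx.net/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
# tar xzf hadoop-2.7.4.tar.gz
# cp hadoop-2.7.4/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar flink/lib/
# cp hadoop-2.7.4/share/hadoop/tools/lib/hadoop-aws-2.7.4.jar flink/lib/
# cp hadoop-2.7.4/share/hadoop/tools/lib/httpclient-4.2.5.jar flink/lib/
# cp hadoop-2.7.4/share/hadoop/tools/lib/httpcore-4.2.5.jar flink/lib/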

# ls flink/lib/
aws-java-sdk-1.7.4.jar
flink-dist_2.11-1.3.1.jar
flink-python_2.11-1.3.1.jar
flink-shaded-hadoop2-uber-1.3.1.jar
hadoop-aws-2.7.4.jar
httpclient-4.2.5.jar
httpcore-4.2.5.jar
log4j-1.2.17.jar
slf4j-log4j12-1.7.7.jar

I’m using the streaming API, with the following example:

// Set StreamExecutionEnvironment
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Use event time for this job
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

// Add source (input stream)
DataStream<String> dataStream = StreamUtil.getDataStream(env, params);

// Sink to S3 Bucket
dataStream.writeAsText("s3://test-flink/test.txt").setParallelism(1);

pom.xml has the following build dependencies.

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-filesystem_2.10</artifactId>
            <version>1.3.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-aws</artifactId>
            <version>2.7.2</version>
        </dependency>


Would anybody be able to spare some time to help me resolve my problem? I'm sure I’m missing something simple here.

Thanks  :-)

Re: No file system found with scheme s3

Posted by Ted Yu <yu...@gmail.com>.
Shouldn't the config key be:

org.apache.hadoop.fs.s3.S3FileSystem

Cheers


Re: No file system found with scheme s3

Posted by vinay patil <vi...@gmail.com>.
Hi,

The config should be *fs.s3a.impl* instead of *fs.s3.impl*.

Also, when you are providing the S3 write path in the config file or directly in
code, start it with *s3a://<path_to_write_data>*
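
If it helps, against the core-site.xml and sink line you posted the change would
look roughly like this (keeping your bucket name purely as an example):

    <property>
        <name>fs.s3a.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>

// Sink to S3 bucket, addressed via the s3a scheme
dataStream.writeAsText("s3a://test-flink/test.txt").setParallelism(1);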



Regards,
Vinay Patil




