Posted to issues@flink.apache.org by "Josh Forman-Gornall (JIRA)" <ji...@apache.org> on 2016/06/23 20:12:16 UTC

[jira] [Updated] (FLINK-4115) FsStateBackend filesystem verification can cause classpath exceptions

     [ https://issues.apache.org/jira/browse/FLINK-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Forman-Gornall updated FLINK-4115:
---------------------------------------
    Description: 
The FsStateBackend constructor initialises the FileSystem for the checkpoint directory and verifies that the checkpoint path exists. This verification runs in the Flink client program when a job is submitted, and it can fail with classpath errors if the classes required to access the file system are not on the client's classpath.

For example, if we run Flink on YARN over AWS EMR using RocksDBStateBackend and an s3:// checkpoint directory, we get the ClassNotFoundException below. The jars needed for the EMR file system are available only in the YARN context, not to the Flink client that submits the job.

{noformat}
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
	at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getHadoopWrapperClassNameForFileSystem(HadoopFileSystem.java:460)
	at org.apache.flink.core.fs.FileSystem.getHadoopWrapperClassNameForFileSystem(FileSystem.java:352)
	at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:280)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.validateAndNormalizeUri(FsStateBackend.java:383)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:175)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:144)
	at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:205)
{noformat}
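
One possible direction is to defer the file system lookup until the backend is initialised on the cluster. The sketch below is only an illustration under stated assumptions, not Flink's actual code: LazyCheckpointBackend, its loader parameter and fileSystem() are hypothetical names, and the loader stands in for a call such as FileSystem.get(uri). The constructor performs only syntactic checks on the URI, so it needs no file system classes on the client's classpath.

```java
import java.net.URI;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch of deferred file system resolution; not Flink's FsStateBackend. */
final class LazyCheckpointBackend<FS> {
    private final URI checkpointUri;
    private final Function<URI, FS> loader; // assumption: stands in for FileSystem.get(uri)
    private FS fileSystem;                  // resolved on first use, not in the constructor

    LazyCheckpointBackend(URI checkpointUri, Function<URI, FS> loader) {
        Objects.requireNonNull(checkpointUri, "checkpointUri");
        // Syntactic checks only: nothing here loads a file system implementation class.
        if (checkpointUri.getScheme() == null) {
            throw new IllegalArgumentException("Checkpoint URI has no scheme: " + checkpointUri);
        }
        String path = checkpointUri.getPath();
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("Checkpoint URI has no path: " + checkpointUri);
        }
        this.checkpointUri = checkpointUri;
        this.loader = Objects.requireNonNull(loader, "loader");
    }

    /** Resolves the file system on first call; intended to run where its classes are available. */
    synchronized FS fileSystem() {
        if (fileSystem == null) {
            fileSystem = loader.apply(checkpointUri);
        }
        return fileSystem;
    }

    public static void main(String[] args) {
        // Simulates a client whose classpath lacks the file system implementation.
        Function<URI, String> failingLoader = uri -> {
            throw new RuntimeException(new ClassNotFoundException("com.example.MissingFileSystem"));
        };
        LazyCheckpointBackend<String> backend =
                new LazyCheckpointBackend<>(URI.create("s3://bucket/checkpoints"), failingLoader);
        System.out.println("Backend constructed without loading the file system");
        try {
            backend.fileSystem();
        } catch (RuntimeException e) {
            System.out.println("Deferred failure: " + e.getCause());
        }
    }
}
```

With this shape, constructing the backend in the client succeeds even when the loader would fail there. A missing class surfaces only when fileSystem() is first called, which is meant to happen in the YARN context.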

  was:
In the constructor of FsStateBackend, the FileSystem for the checkpoint directory is initialised and it is verified that the checkpoint path exists. This verification happens in the Flink client program when submitting a job and can cause classpath issues if classes required to access the file system are not available in the client's classpath.

For example, if we run Flink on YARN over AWS EMR using RocksDBStateBackend and an s3:// checkpoint directory, we get the below ClassNotFoundException. This is because the jars needed to use the EMR file system are available only in the YARN context and not when submitting the job via the Flink client.

```
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
	at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getHadoopWrapperClassNameForFileSystem(HadoopFileSystem.java:460)
	at org.apache.flink.core.fs.FileSystem.getHadoopWrapperClassNameForFileSystem(FileSystem.java:352)
	at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:280)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.validateAndNormalizeUri(FsStateBackend.java:383)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:175)
	at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:144)
	at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:205)
```


> FsStateBackend filesystem verification can cause classpath exceptions
> ---------------------------------------------------------------------
>
>                 Key: FLINK-4115
>                 URL: https://issues.apache.org/jira/browse/FLINK-4115
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>            Reporter: Josh Forman-Gornall
>            Priority: Minor


