You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by StephanEwen <gi...@git.apache.org> on 2017/10/04 19:28:35 UTC

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/4776

    [FLINK-7643] [core] Rework FileSystem loading to use factories

    ## What is the purpose of the change
    
    This change reworks the loading and instantiation of File System objects (including file systems supported via Hadoop) to use factories. 
    
    This makes sure that configurations (Flink and possibly Hadoop) are loaded once (on TaskManager / JobManager startup) and file system instances are properly reused by scheme and authority. That way, this change 
    
    This change is also a prerequisite for an extensible file system loading mechanism via a service framework.
    
    ## Brief change log
    
      - The special-case configuration of the `FileSystem` class to set the "default file system scheme" is extended to a generic configuration call.
      - The directory of directly supported file systems is changed from classes (instantiated via reflection) to factories.
      - These factories are also configured when the `FileSystem` is configured.
      - The Hadoop file system factory loads the Hadoop configuration once when being configured and applies it to all subsequently instantiated file systems.
      - File systems supported via Hadoop are now properly cached and not reloaded, reinstantiated, and reconfigured on each access.
      - This also throws out a lot of legacy code for how to find Hadoop file system implementations
      - The `FileSystem` class is much cleaner now because a lot of the Hadoop FS
      - All file systems now eagerly initialize their settings, rather than dividing that between the constructor and the `initialize()` method.
      - This also factors out a lot of the special treatment of Hadoop file systems and simply makes the Hadoop File System factory the default fallback factory.
    
    ## Verifying this change
    
    Reworked some tests to cover the behavior of this change:
      - `flink-core/src/test/java/org/apache/flink/configuration/FilesystemSchemeConfigTest.java`
      - `flink-runtime/src/test/java/org/apache/flink/runtime/taskmanager/TaskManagerConfigurationTest.java`
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (yes / **no**)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes** / no)
      - The serializers: (yes / **no** / don't know)
      - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
    
    *Note:* The breaking changes made on `@Public` class `FileSystem` do not include methods that are meant for users, but only the setup configuration.
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / **no**)
      - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink fs_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4776.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4776
    
----
commit ba312e137c7af1d2c331c5231b5b0ae3e0401549
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T12:34:27Z

    [FLINK-7643] [core] Misc. cleanups in FileSystem
    
      - Simplify access to local file system
      - Use a fair lock for all FileSystem.get() operations
      - Robust falback to local fs for default scheme (avoids URI parsing error on Windows)
      - Deprecate 'getDefaultBlockSize()'
      - Deprecate create(...) with block sizes and replication factor, which is not applicable to many FS

commit 8130d874b8b823f22964f435bf1a1d1bd39774d6
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T14:25:18Z

    [FLINK-7643] [core] Rework FileSystem loading to use factories
    
    This makes sure that configurations are loaded once and file system instances are
    properly reused by scheme and authority.
    
    This also factors out a lot of the special treatment of Hadoop file systems and simply
    makes the Hadoop File System factory the default fallback factory.

commit c652f1322044f9715a0d94fa21ec853769be9a78
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T14:30:07Z

    [FLINK-7643] [core] Drop eager checks for file system support.
    
    Some places validate if the file URIs are resolvable on the client. This leads to
    problems when file systems are not accessible from the client, when the full libraries for
    the file systems are not present on the client (for example often the case in cloud setups),
    or when the configuration on the client is different from the nodes/containers that will
    execute the application.

----


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142989415
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    +
    +	private static final String HADOOP_CONFIG_CLASS = "org.apache.hadoop.conf.Configuration";
    +
    +	private static final String HADOOP_FS_CLASS = "org.apache.hadoop.fs.FileSystem";
    +
    +
    +	public static FileSystemFactory loadFactory() {
    +		final ClassLoader cl = HadoopFileSystemFactoryLoader.class.getClassLoader();
    +
    +		// first, see if the Flink runtime classes are available
    +		final Class<? extends FileSystemFactory> factoryClass;
    +		try {
    +			factoryClass = Class.forName(FACTORY_CLASS, false, cl).asSubclass(FileSystemFactory.class);
    +		}
    +		catch (ClassNotFoundException e) {
    +			return new UnsupportedSchemeFactory("Flink runtime classes missing in classpath/dependencies.");
    +		}
    +		catch (Exception | LinkageError e) {
    --- End diff --
    
    ditto


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142989725
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    +
    +	private static final String HADOOP_CONFIG_CLASS = "org.apache.hadoop.conf.Configuration";
    +
    +	private static final String HADOOP_FS_CLASS = "org.apache.hadoop.fs.FileSystem";
    +
    +
    +	public static FileSystemFactory loadFactory() {
    +		final ClassLoader cl = HadoopFileSystemFactoryLoader.class.getClassLoader();
    +
    +		// first, see if the Flink runtime classes are available
    +		final Class<? extends FileSystemFactory> factoryClass;
    +		try {
    +			factoryClass = Class.forName(FACTORY_CLASS, false, cl).asSubclass(FileSystemFactory.class);
    +		}
    +		catch (ClassNotFoundException e) {
    +			return new UnsupportedSchemeFactory("Flink runtime classes missing in classpath/dependencies.");
    +		}
    +		catch (Exception | LinkageError e) {
    +			return new UnsupportedSchemeFactory("Flink's Hadoop file system factory could not be loaded", e);
    +		}
    +
    +		// check (for eager and better exception messages) if the Hadoop classes are available here
    +		try {
    +			Class.forName(HADOOP_CONFIG_CLASS, false, cl);
    +			Class.forName(HADOOP_FS_CLASS, false, cl);
    +		}
    +		catch (ClassNotFoundException e) {
    --- End diff --
    
    ditto


---

[GitHub] flink issue #4776: [FLINK-7643] [core] Rework FileSystem loading to use fact...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/4776
  
    @bowenli86 We need to fail lazily, because Flink should be able to always work without MapR FS or HDFS being in the classpath.
    
    With the change currently, you can start Flink without any Hadoop dependencies and it works fine. It only fails then if you try to use HDFS. Failing everything eager means it always fails when MapR or Hadoop classes are not in the classpath.
    
    BTW: That behavior is the same as right now - I did not change it, just 


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142989384
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    +
    +	private static final String HADOOP_CONFIG_CLASS = "org.apache.hadoop.conf.Configuration";
    +
    +	private static final String HADOOP_FS_CLASS = "org.apache.hadoop.fs.FileSystem";
    +
    +
    +	public static FileSystemFactory loadFactory() {
    +		final ClassLoader cl = HadoopFileSystemFactoryLoader.class.getClassLoader();
    +
    +		// first, see if the Flink runtime classes are available
    +		final Class<? extends FileSystemFactory> factoryClass;
    +		try {
    +			factoryClass = Class.forName(FACTORY_CLASS, false, cl).asSubclass(FileSystemFactory.class);
    +		}
    +		catch (ClassNotFoundException e) {
    --- End diff --
    
    Shall we log a warning for this exception? I'd prefer to have a log entry at the very top of log file to be easily discovered


---

[GitHub] flink issue #4776: [FLINK-7643] [core] Rework FileSystem loading to use fact...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on the issue:

    https://github.com/apache/flink/pull/4776
  
    ok, sounds good. That also explains some questions I commented, so I removed them.


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r143142386
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java ---
    @@ -328,116 +360,54 @@ public static FileSystem getUnguardedFileSystem(URI uri) throws IOException {
     			final FSKey key = new FSKey(uri.getScheme(), uri.getAuthority());
    --- End diff --
    
    Actually. `authority` already includes the port (authority = host + port).


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142787795
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    --- End diff --
    
    why not using `HadoopFsFactory.class.getCanonicalName()`?
    



---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142788766
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/MapRFsFactory.java ---
    @@ -0,0 +1,75 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.configuration.Configuration;
    +import org.apache.flink.core.fs.FileSystem;
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +import java.io.IOException;
    +import java.lang.reflect.Constructor;
    +import java.lang.reflect.InvocationTargetException;
    +import java.net.URI;
    +
    +/**
    + * A factory for the MapR file system.
    + * 
    + * <p>This factory tries to reflectively instantiate the MapR file system. It can only be
    + * used when the MapR FS libraries are in the classpath.
    + */
    +public class MapRFsFactory implements FileSystemFactory {
    +
    +	private static final String MAPR_FILESYSTEM_CLASS = "org.apache.flink.runtime.fs.maprfs.MapRFileSystem";
    --- End diff --
    
    ditto


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142989807
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    +
    +	private static final String HADOOP_CONFIG_CLASS = "org.apache.hadoop.conf.Configuration";
    +
    +	private static final String HADOOP_FS_CLASS = "org.apache.hadoop.fs.FileSystem";
    +
    +
    +	public static FileSystemFactory loadFactory() {
    +		final ClassLoader cl = HadoopFileSystemFactoryLoader.class.getClassLoader();
    +
    +		// first, see if the Flink runtime classes are available
    +		final Class<? extends FileSystemFactory> factoryClass;
    +		try {
    +			factoryClass = Class.forName(FACTORY_CLASS, false, cl).asSubclass(FileSystemFactory.class);
    +		}
    +		catch (ClassNotFoundException e) {
    +			return new UnsupportedSchemeFactory("Flink runtime classes missing in classpath/dependencies.");
    +		}
    +		catch (Exception | LinkageError e) {
    +			return new UnsupportedSchemeFactory("Flink's Hadoop file system factory could not be loaded", e);
    +		}
    +
    +		// check (for eager and better exception messages) if the Hadoop classes are available here
    +		try {
    +			Class.forName(HADOOP_CONFIG_CLASS, false, cl);
    +			Class.forName(HADOOP_FS_CLASS, false, cl);
    +		}
    +		catch (ClassNotFoundException e) {
    +			return new UnsupportedSchemeFactory("Hadoop is not in the classpath/dependencies.");
    +		}
    +
    +		// Create the factory.
    +		try {
    +			return factoryClass.newInstance();
    +		}
    +		catch (Exception | LinkageError e) {
    --- End diff --
    
    ditto


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142880460
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java ---
    @@ -328,116 +360,54 @@ public static FileSystem getUnguardedFileSystem(URI uri) throws IOException {
     			final FSKey key = new FSKey(uri.getScheme(), uri.getAuthority());
    --- End diff --
    
    Stephan: Add port here as well if possible, to support multiple separate HDFS installations.


---

[GitHub] flink issue #4776: [FLINK-7643] [core] Rework FileSystem loading to use fact...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/4776
  
    Thanks for the comments. Addressing them and merging them...


---

[GitHub] flink issue #4776: [FLINK-7643] [core] Rework FileSystem loading to use fact...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/4776
  
    @bowenli86 I can add a log statement.


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142787414
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    --- End diff --
    
    A?


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142789177
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/UnsupportedSchemeFactory.java ---
    @@ -0,0 +1,65 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.configuration.Configuration;
    +import org.apache.flink.core.fs.FileSystem;
    +import org.apache.flink.core.fs.FileSystemFactory;
    +import org.apache.flink.core.fs.UnsupportedFileSystemSchemeException;
    +
    +import javax.annotation.Nullable;
    +import java.io.IOException;
    +import java.net.URI;
    +
    +import static org.apache.flink.util.Preconditions.checkNotNull;
    +
    +/**
    + * A file system factory to throw an UnsupportedFileSystemSchemeException when called.
    + */
    +public class UnsupportedSchemeFactory implements FileSystemFactory {
    --- End diff --
    
    I feel this is unnecessary. We can fail Flink faster and earlier, instead of having a wait until `create()` is called


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142881593
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    --- End diff --
    
    yes, the comment needs extension...


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142788553
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/factories/HadoopFileSystemFactoryLoader.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.core.fs.factories;
    +
    +import org.apache.flink.core.fs.FileSystemFactory;
    +
    +/**
    + * A
    + */
    +public class HadoopFileSystemFactoryLoader {
    +
    +	private static final String FACTORY_CLASS = "org.apache.flink.runtime.fs.hdfs.HadoopFsFactory";
    +
    +	private static final String HADOOP_CONFIG_CLASS = "org.apache.hadoop.conf.Configuration";
    +
    +	private static final String HADOOP_FS_CLASS = "org.apache.hadoop.fs.FileSystem";
    +
    +
    +	public static FileSystemFactory loadFactory() {
    +		final ClassLoader cl = HadoopFileSystemFactoryLoader.class.getClassLoader();
    +
    +		// first, see if the Flink runtime classes are available
    +		final Class<? extends FileSystemFactory> factoryClass;
    +		try {
    +			factoryClass = Class.forName(FACTORY_CLASS, false, cl).asSubclass(FileSystemFactory.class);
    +		}
    +		catch (ClassNotFoundException e) {
    +			return new UnsupportedSchemeFactory("Flink runtime classes missing in classpath/dependencies.");
    --- End diff --
    
    why return an unsupported factory which will fail later, rather than fail early here in class loader?


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4776


---

[GitHub] flink pull request #4776: [FLINK-7643] [core] Rework FileSystem loading to u...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4776#discussion_r142880553
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/FileSystemFactory.java ---
    @@ -15,18 +15,19 @@
      * See the License for the specific language governing permissions and
      * limitations under the License.
      */
    +
     package org.apache.flink.core.fs;
     
     import org.apache.flink.annotation.PublicEvolving;
    +import org.apache.flink.configuration.Configuration;
    +
    +import java.io.IOException;
    +import java.net.URI;
     
     @PublicEvolving
    -public interface HadoopFileSystemWrapper {
    +public interface FileSystemFactory {
    --- End diff --
    
    Missing comments


---