You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by gallenvara <gi...@git.apache.org> on 2015/08/19 05:24:38 UTC

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

GitHub user gallenvara opened a pull request:

    https://github.com/apache/flink/pull/1035

    [FLINK-2077] [core] Rework Path class and add extend support for Windows paths

    The class org.apache.flink.core.fs.Path handles paths for Flink's FileInputFormat and FileOutputFormat. Over time, this class has become quite hard to read and modify.
    This PR does some work on cleaning and refactoring.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gallenvara/flink path_refactor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1035.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1035
    
----
commit 526a50a4ba38ae84574c31a6e67b39ac14649d44
Author: gallenvara <ga...@126.com>
Date:   2015-08-18T10:31:40Z

    [Flink-2077] Clean and refactor the Path class.

commit 4fb1b03f89d1a3a2793e957ff0132ba70bffb3a6
Author: gallenvara <ga...@126.com>
Date:   2015-08-19T01:41:55Z

    [Flink-2077] Clean and refactor the Path class.

commit 380ed45473f51ff2835b7777f8fcaa00e45c7b0b
Author: gallenvara <ga...@126.com>
Date:   2015-08-19T02:46:46Z

    [Flink-2077] Clean and refactor the Path class.

commit b7c770df6b2b255a30b5af5589544bf1597f32f5
Author: gallenvara <ga...@126.com>
Date:   2015-08-19T03:22:41Z

    [FLINK-2077] [core] Rework Path class and add extend support for Windows paths

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-133340112
  
    Hi @gallenvara, great to see that more and more new people join the Flink community :-)
    
    IMHO, whether a method is used from within the same class or not is not a good indicator for moving a method out of it. You'll find many classes in Flink which define methods which are not used by itself. It's part of the idea of OOP that the methods you define on a class constitute the interface to it. Thus, it defines what you can do with it. In that sense, `makeQualified` should stay part of the `Path` class, because it allows you to create a qualified path object from another path object.
    
    Moreover, scattering functionality around in some static util classes won't make it easier for others to actually find it. How should people know that there is a class `FileSystemUtil` which allows you to convert paths into qualified paths? It's more natural to look at the methods defined in the `Path` class. Having much code in one class is not per se bad if you structure it clearly and if your methods don't become behemoths it's fine. I also don't think that @fhueske had this in mind when he created the JIRA issue.
    
    Maybe I'm totally wrong here, but I think we should close this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1035#discussion_r37512972
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/Path.java ---
    @@ -430,40 +296,127 @@ public int depth() {
     	}
     
     	/**
    -	 * Returns a qualified path object.
    -	 * 
    -	 * @param fs
    -	 *        the FileSystem that should be used to obtain the current working directory
    -	 * @return the qualified path object
    +	 * Checks if the provided path string is either null or has zero length and throws
    +	 * a {@link IllegalArgumentException} if any of the two conditions apply.
    +	 * In addition, leading and tailing whitespaces are removed.
    +	 *
    +	 * @param path
    +	 *        the path string to be checked
    +	 * @return The checked and trimmed path.
    +	 */
    +	private String checkAndTrimPathArg(String path) {
    +		// disallow construction of a Path from an empty string
    +		if (path == null) {
    +			throw new IllegalArgumentException("Can not create a Path from a null string");
    +		}
    +		path = path.trim();
    +		if (path.length() == 0) {
    +			throw new IllegalArgumentException("Can not create a Path from an empty string");
    +		}
    +		return path;
    +	}
    +
    +	/**
    +	 * Initializes a path object given the scheme, authority and path string.
    +	 *
    +	 * @param scheme
    +	 *        the scheme string.
    +	 * @param authority
    +	 *        the authority string.
    +	 * @param path
    +	 *        the path string.
     	 */
    -	public Path makeQualified(FileSystem fs) {
    -		Path path = this;
    -		if (!isAbsolute()) {
    -			path = new Path(fs.getWorkingDirectory(), this);
    +	private void initialize(String scheme, String authority, String path) {
    +		try {
    +			this.uri = new URI(scheme, authority, normalizePath(path), null, null).normalize();
    +		} catch (URISyntaxException e) {
    +			throw new IllegalArgumentException(e);
     		}
    +	}
     
    -		final URI pathUri = path.toUri();
    -		final URI fsUri = fs.getUri();
    +	/**
    +	 * Normalizes a path string.
    +	 *
    +	 * @param path
    +	 *        the path string to normalize
    +	 * @return the normalized path string
    +	 */
    +	private String normalizePath(String path) {
     
    -		String scheme = pathUri.getScheme();
    -		String authority = pathUri.getAuthority();
    +		// remove leading and tailing whitespaces
    +		path = path.trim();
     
    -		if (scheme != null && (authority != null || fsUri.getAuthority() == null)) {
    -			return path;
    +		// remove consecutive slashes & backslashes
    +		path = path.replace("\\", "/");
    +		path = path.replaceAll("/+", "/");
    --- End diff --
    
    Doesn't this also replaces a single `/` with a `/`. Maybe the regex should be `//+`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-132962489
  
    @gallenvara, could you elaborate a little bit on why you've moved the `makeQualified` method into the static `FileSystemUtil` class? What's the benefit of it. Is this the only thing which can be refactored to make the `Path` class more readable?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-133364252
  
    A few more comments from my side:
    
      - The class overrides `equals()` but `hashCode()` has been removes. This is a classical bug.
      - The `compareTo()` method was a nice property (one could use Paths as data types and keys). Would be nice to keep it.
      - The main original problem was that the class did not support *mounted* drives on Windows, as are common in environments with shared storage (Microsoft Azure Cloud). I think these mounted drive paths start with `\\` or `//`.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by gallenvara <gi...@git.apache.org>.
Github user gallenvara commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1035#discussion_r37610286
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/Path.java ---
    @@ -430,40 +296,127 @@ public int depth() {
     	}
     
     	/**
    -	 * Returns a qualified path object.
    -	 * 
    -	 * @param fs
    -	 *        the FileSystem that should be used to obtain the current working directory
    -	 * @return the qualified path object
    +	 * Checks if the provided path string is either null or has zero length and throws
    +	 * a {@link IllegalArgumentException} if any of the two conditions apply.
    +	 * In addition, leading and tailing whitespaces are removed.
    +	 *
    +	 * @param path
    +	 *        the path string to be checked
    +	 * @return The checked and trimmed path.
    +	 */
    +	private String checkAndTrimPathArg(String path) {
    +		// disallow construction of a Path from an empty string
    +		if (path == null) {
    +			throw new IllegalArgumentException("Can not create a Path from a null string");
    +		}
    +		path = path.trim();
    +		if (path.length() == 0) {
    +			throw new IllegalArgumentException("Can not create a Path from an empty string");
    +		}
    +		return path;
    +	}
    +
    +	/**
    +	 * Initializes a path object given the scheme, authority and path string.
    +	 *
    +	 * @param scheme
    +	 *        the scheme string.
    +	 * @param authority
    +	 *        the authority string.
    +	 * @param path
    +	 *        the path string.
     	 */
    -	public Path makeQualified(FileSystem fs) {
    -		Path path = this;
    -		if (!isAbsolute()) {
    -			path = new Path(fs.getWorkingDirectory(), this);
    +	private void initialize(String scheme, String authority, String path) {
    +		try {
    +			this.uri = new URI(scheme, authority, normalizePath(path), null, null).normalize();
    +		} catch (URISyntaxException e) {
    +			throw new IllegalArgumentException(e);
     		}
    +	}
     
    -		final URI pathUri = path.toUri();
    -		final URI fsUri = fs.getUri();
    +	/**
    +	 * Normalizes a path string.
    +	 *
    +	 * @param path
    +	 *        the path string to normalize
    +	 * @return the normalized path string
    +	 */
    +	private String normalizePath(String path) {
     
    -		String scheme = pathUri.getScheme();
    -		String authority = pathUri.getAuthority();
    +		// remove leading and tailing whitespaces
    +		path = path.trim();
     
    -		if (scheme != null && (authority != null || fsUri.getAuthority() == null)) {
    -			return path;
    +		// remove consecutive slashes & backslashes
    +		path = path.replace("\\", "/");
    +		path = path.replaceAll("/+", "/");
    --- End diff --
    
    Hi, @tillrohrmann  thanks for your review. But i haven't modified this line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by gallenvara <gi...@git.apache.org>.
Github user gallenvara commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-133315613
  
    @tillrohrmann  I'm new to Flink. Thanks for your words and if i have any mistake please point it out.
    My idea on this issue is : 
    the ```Path``` class has become quite hard to read and modify. In ```Path``` class, the ```makeQualified``` method hasn't been called once.In order to reduce the amount of ```Path``` code and make the class much more thinner ,i have moved it to the util named ```FileSystemUtil``` as a tool. Also, i think the ```Path``` class handles the input and output's path.The ```makeQualified``` method handles the FileSystem . I think they are parellel.
    So i create the ```FileSystemUtil``` as a tool instead of being a method in ```Path``` class.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by gallenvara <gi...@git.apache.org>.
Github user gallenvara closed the pull request at:

    https://github.com/apache/flink/pull/1035


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by gallenvara <gi...@git.apache.org>.
Github user gallenvara commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-132482759
  
    Hi, @zentol .Thanks for your review.
    Remove two methods : hashCode() and compareTo()


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/flink/pull/1035#issuecomment-132481210
  
    The changes i see are:
    * removed hashCode()
    * moved makeQualified to a new file as a static method
    * reordered the remaining methods in Path
    
    is that about right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1035#discussion_r37616089
  
    --- Diff: flink-core/src/main/java/org/apache/flink/core/fs/Path.java ---
    @@ -430,40 +296,127 @@ public int depth() {
     	}
     
     	/**
    -	 * Returns a qualified path object.
    -	 * 
    -	 * @param fs
    -	 *        the FileSystem that should be used to obtain the current working directory
    -	 * @return the qualified path object
    +	 * Checks if the provided path string is either null or has zero length and throws
    +	 * a {@link IllegalArgumentException} if any of the two conditions apply.
    +	 * In addition, leading and tailing whitespaces are removed.
    +	 *
    +	 * @param path
    +	 *        the path string to be checked
    +	 * @return The checked and trimmed path.
    +	 */
    +	private String checkAndTrimPathArg(String path) {
    +		// disallow construction of a Path from an empty string
    +		if (path == null) {
    +			throw new IllegalArgumentException("Can not create a Path from a null string");
    +		}
    +		path = path.trim();
    +		if (path.length() == 0) {
    +			throw new IllegalArgumentException("Can not create a Path from an empty string");
    +		}
    +		return path;
    +	}
    +
    +	/**
    +	 * Initializes a path object given the scheme, authority and path string.
    +	 *
    +	 * @param scheme
    +	 *        the scheme string.
    +	 * @param authority
    +	 *        the authority string.
    +	 * @param path
    +	 *        the path string.
     	 */
    -	public Path makeQualified(FileSystem fs) {
    -		Path path = this;
    -		if (!isAbsolute()) {
    -			path = new Path(fs.getWorkingDirectory(), this);
    +	private void initialize(String scheme, String authority, String path) {
    +		try {
    +			this.uri = new URI(scheme, authority, normalizePath(path), null, null).normalize();
    +		} catch (URISyntaxException e) {
    +			throw new IllegalArgumentException(e);
     		}
    +	}
     
    -		final URI pathUri = path.toUri();
    -		final URI fsUri = fs.getUri();
    +	/**
    +	 * Normalizes a path string.
    +	 *
    +	 * @param path
    +	 *        the path string to normalize
    +	 * @return the normalized path string
    +	 */
    +	private String normalizePath(String path) {
     
    -		String scheme = pathUri.getScheme();
    -		String authority = pathUri.getAuthority();
    +		// remove leading and tailing whitespaces
    +		path = path.trim();
     
    -		if (scheme != null && (authority != null || fsUri.getAuthority() == null)) {
    -			return path;
    +		// remove consecutive slashes & backslashes
    +		path = path.replace("\\", "/");
    +		path = path.replaceAll("/+", "/");
    --- End diff --
    
    It's not about who has modified this line or not, it's about correctness and I think this line is wrong. Thus, it would be good if you corrected it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---