Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2019/09/02 01:37:37 UTC

[GitHub] [nifi] ijokarumawak commented on a change in pull request #3483: NIFI-6275 ListHDFS now ignores scheme and authority when uses "Full P…

ijokarumawak commented on a change in pull request #3483: NIFI-6275 ListHDFS now ignores scheme and authority when uses "Full P…
URL: https://github.com/apache/nifi/pull/3483#discussion_r319786069
 
 

 ##########
 File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java
 ##########
 @@ -527,7 +527,7 @@ private PathFilter createPathFilter(final ProcessContext context) {
         return path -> {
             final boolean accepted;
             if (FILTER_FULL_PATH_VALUE.getValue().equals(filterMode)) {
-                accepted = filePattern.matcher(path.toString()).matches();
+                accepted = filePattern.matcher(Path.getPathWithoutSchemeAndAuthority(path).toString()).matches();
 
 Review comment:
   If this improvement could break existing user flows, I'd like to discuss other approaches that make it opt-in.
   
   Different approaches give users a different experience:
   1. Current approach: if the regex in an existing flow contains the scheme or authority, the flow will no longer list files as before. Users may wonder what went wrong, and may not notice the change if they don't read the docs.
   2. Add a new 'Filter without Scheme and Authority' property:
       - A. If we leave its default value blank and implement custom validation that requires it when the filter regex is not empty, existing ListHDFS processors become invalid. That gives users a chance to review their configuration.
       - B. If we use `false` as the default value, existing flows work as is and this improvement is opt-in. This is the safest approach, but a con is that people may forget to enable the option.
   3. Add a new 'Full Path (without scheme and authority)' filter mode, or add the new mode and also rename the existing one's display name to 'Full Path (include scheme and authority)': this guarantees existing flows work as is, while providing an easy configuration UX for new setups.
   
   I personally prefer option 3 above. What do you think?
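
   To make the compatibility concern concrete, here is a rough, self-contained sketch of how option 3 might behave. It is only an illustration under assumptions: `java.net.URI` stands in for Hadoop's `Path` so that it runs without Hadoop on the classpath, and the `FilterModeSketch` class, the `FilterMode` enum, the `accepts` method, and the `namenode:8020` address are all hypothetical names, not part of ListHDFS.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical sketch: java.net.URI stands in for org.apache.hadoop.fs.Path,
// and every name below is illustrative rather than NiFi's actual API.
class FilterModeSketch {

    enum FilterMode {
        FULL_PATH_WITH_SCHEME_AND_AUTHORITY,    // existing behavior, kept as is
        FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY  // new, opt-in behavior
    }

    static boolean accepts(final FilterMode mode, final Pattern filePattern, final URI path) {
        // Pick the string the regex is matched against, depending on the mode.
        final String candidate = mode == FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY
                ? path.getPath()    // /data/in/a.csv
                : path.toString();  // hdfs://namenode:8020/data/in/a.csv
        return filePattern.matcher(candidate).matches();
    }

    public static void main(final String[] args) {
        final URI path = URI.create("hdfs://namenode:8020/data/in/a.csv");

        // A regex an existing flow might use, written against the full URI.
        final Pattern existingRegex = Pattern.compile("hdfs://namenode:8020/data/in/.*\\.csv");
        // A regex a new flow might use, written against the path only.
        final Pattern pathOnlyRegex = Pattern.compile("/data/in/.*\\.csv");

        // Existing flows keep working in the existing mode.
        System.out.println(accepts(FilterMode.FULL_PATH_WITH_SCHEME_AND_AUTHORITY, existingRegex, path));    // true
        // The same regex stops matching once scheme and authority are stripped.
        System.out.println(accepts(FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY, existingRegex, path)); // false
        // New flows can opt in and use the simpler regex.
        System.out.println(accepts(FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY, pathOnlyRegex, path)); // true
    }
}
```

   The second call shows the silent breakage described in approach 1: a regex written against the full URI no longer matches once the scheme and authority are removed. Keeping the old mode next to the new one avoids that.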
