You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2020/11/11 14:18:38 UTC

[GitHub] [nifi] tpalfy opened a new pull request #4655: NIFI-7972 TailFile NFS improvement

tpalfy opened a new pull request #4655:
URL: https://github.com/apache/nifi/pull/4655


   Add a boolean property by which the user can tell the processor to yield (and try again later) whenever it encounters a NUL character.
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522945238



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -610,7 +625,13 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         }
 
         for (String tailFile : states.keySet()) {
-            processTailFile(context, session, tailFile);
+            try {
+                processTailFile(context, session, tailFile);
+            } catch (NulCharacterEncounteredException e) {
+                getLogger().warn("NUL character encountered in " + tailFile + " and '" + REREAD_ON_NUL.getDisplayName() + "' is set to 'true', yielding.");
+                context.yield();
+                return;

Review comment:
       I'll update the description of the property to note this behaviour.
   If the behaviour itself is deemed to be fine, I'd rather not add a ticket (that would probably wouldn't be taken care of anytime soon).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522381602



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "

Review comment:
       To me "temporarily replaced" feels a bit better as it conveys the notion of "switching" the value with something else for a time.
   "filled up" better conveys the notion of doing this up to a certain point. I consider the first notion more important but debatable.
   If this comment gets another vote I'll change it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522793037



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "

Review comment:
       My only concern about replacement is that implies that there was something else there before. My understanding is the NULs are new content, added as placeholder to the real content will be replaced by the real content. So replacement is more like about NUL and new content.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 closed pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
markap14 closed pull request #4655:
URL: https://github.com/apache/nifi/pull/4655


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522383928



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +781,26 @@ private void processTailFile(final ProcessContext context, final ProcessSession
 
         final FileChannel fileReader = reader;
         final AtomicLong positionHolder = new AtomicLong(position);
+        Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();

Review comment:
       As for me, I'd rather remove the "final" keywords from the other ones. All three are effectively final anyway and the many final keywords make the code a bit too bloated in my opinion.
   Let's see if others have a comment on this.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
markap14 commented on pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#issuecomment-728297905


   Thanks for the fix @tpalfy! Code changes look good to me. Added unit tests help. All existing tests still pass, so I'm a +1. I made a comment in the review. You may well already be familiar with the pattern but it's a pattern that wasn't possible until somewhat recently so i figured i'd mention it. In any case, I approve the changes. Will merged. Thanks! And a big thank you to @simonbence for reviewing as well!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522385228



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -1365,4 +1406,16 @@ public String toString() {
             return map;
         }
     }
+
+    static class NulCharacterEncounteredException extends RuntimeException {

Review comment:
       This should be a rare occurrence, but a fair point. I can add that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
markap14 commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r524528057



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +782,26 @@ private void processTailFile(final ProcessContext context, final ProcessSession
 
         final FileChannel fileReader = reader;
         final AtomicLong positionHolder = new AtomicLong(position);
+        final Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();
+
+        AtomicReference<NulCharacterEncounteredException> abort = new AtomicReference<>();

Review comment:
       I think this approach is fine and not worth changing at this point, but just for future reference: when Exceptions might be thrown from a callback, it tends to be cleaner to use the `OutputStream write(FlowFile)` method of `ProcessSession`. So something like:
   ```
   try (final OutputStream rawOut = session.write(flowFile)) {
       ... interact with OutputStream here. Since there's no callback, any Exception that gets thrown here will just propagate outside of the try-with-resources ...
   } catch (NulCharacterEncounteredException e) {
     positionHolder.set(e.getRePos());
     session.remove(flowFile);
     throw e;
   }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522794526



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "

Review comment:
       It is just a very minor thing and was only about repeating the word "this". What do you think about "setting this flag causes the current processor" (as you phrased in you answer already :) )




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
markap14 commented on pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#issuecomment-728300269


   I forgot to include the magical "This closes #..." incantation so had to close the PR manually.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522942979



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "

Review comment:
       I think the current description is fine. 

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "

Review comment:
       I think the current description is fine.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522376455



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -610,7 +625,13 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         }
 
         for (String tailFile : states.keySet()) {
-            processTailFile(context, session, tailFile);
+            try {
+                processTailFile(context, session, tailFile);
+            } catch (NulCharacterEncounteredException e) {
+                getLogger().warn("NUL character encountered in " + tailFile + " and '" + REREAD_ON_NUL.getDisplayName() + "' is set to 'true', yielding.");
+                context.yield();
+                return;

Review comment:
       It is intentional. Your suggestion makes sense and would be a bit better but I think the complexity it would require is not worth it. This should be a rare scenario.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522378310



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "

Review comment:
       I intentionally used "this" to emphasize a bit more that what is going to be stuck is the current processor.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522798979



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +781,26 @@ private void processTailFile(final ProcessContext context, final ProcessSession
 
         final FileChannel fileReader = reader;
         final AtomicLong positionHolder = new AtomicLong(position);
+        Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();

Review comment:
       I am okay with both direction, my point is to have some uniformity there. Suggesting to add the final is merely from the fact that the original code goes with it, but my main purpose was to have similar style




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522797800



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -610,7 +625,13 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         }
 
         for (String tailFile : states.keySet()) {
-            processTailFile(context, session, tailFile);
+            try {
+                processTailFile(context, session, tailFile);
+            } catch (NulCharacterEncounteredException e) {
+                getLogger().warn("NUL character encountered in " + tailFile + " and '" + REREAD_ON_NUL.getDisplayName() + "' is set to 'true', yielding.");
+                context.yield();
+                return;

Review comment:
       As it might be unexpected for the users, I would mention this in the documentation. Also, if you are not willing to go with it, please create a ticket as a possible further improvement. (It might or might not be covered but I think would be good to not forgot about it)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522945474



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +781,26 @@ private void processTailFile(final ProcessContext context, final ProcessSession
 
         final FileChannel fileReader = reader;
         final AtomicLong positionHolder = new AtomicLong(position);
+        Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();

Review comment:
       I'll add 'final'.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522793037



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "

Review comment:
       My only concern about replacement is that implies that there was something else there before. My understanding is the NULs are new content, added as placeholder to the real content will be replaced by the real content. So replacement is more like about NUL and new content, but I might be rwong




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] simonbence commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522080154



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "

Review comment:
       Minor: temporarily filled up?

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -610,7 +625,13 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         }
 
         for (String tailFile : states.keySet()) {
-            processTailFile(context, session, tailFile);
+            try {
+                processTailFile(context, session, tailFile);
+            } catch (NulCharacterEncounteredException e) {
+                getLogger().warn("NUL character encountered in " + tailFile + " and '" + REREAD_ON_NUL.getDisplayName() + "' is set to 'true', yielding.");
+                context.yield();
+                return;

Review comment:
       This will prevent the other tail files later in the collection from processing. Is this intentional? I would expect the following test to be green:
   
   ```
       @Test
       public void testMultipleFilesWhenOneContainsNul() throws Exception {
           // given
           runner.setProperty(TailFile.BASE_DIRECTORY, "target");
           runner.setProperty(TailFile.REREAD_ON_NUL, "true");
           runner.setProperty(TailFile.FILENAME, "log(ging)?.txt");
           runner.setProperty(TailFile.MODE, TailFile.MODE_MULTIFILE);
   
           final File myOtherFile = new File("target/logging.txt");
           if(myOtherFile.exists()) {
               myOtherFile.delete();
           }
   
           final RandomAccessFile myOtherRaf = new RandomAccessFile(myOtherFile, "rw");
           runner.setProperty(TailFile.START_POSITION, TailFile.START_CURRENT_FILE.getValue());
   
           // The order might be not deterministic
           raf.write("first_line_without_nul\n".getBytes());
           myOtherRaf.write("another_line_with_nul\0\n".getBytes());
   
           // when
           runner.run();
   
           // then
           final List<MockFlowFile> flowFiles = runner.getFlowFilesForRelationship(TailFile.REL_SUCCESS);
           Assert.assertEquals(1, flowFiles.size());
       }
   ```

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "
+                + "For this reason users should refrain from using this feature if they can help it and try to avoid having the target file on a file system where reads are unreliable.")
+            .required(false)
+            .allowableValues("true", "false")
+            .defaultValue("false")

Review comment:
       I think, as it has "defaultValue" it could be required

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "

Review comment:
       Minor: setting this flag causes the processor?

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -1365,4 +1406,16 @@ public String toString() {
             return map;
         }
     }
+
+    static class NulCharacterEncounteredException extends RuntimeException {

Review comment:
       Just a possible performance improvement: as I understand, this kind of exception should not leave the boundaries of the onTrigger and basically used for flow control. Thus, stack trace will not be used for debug or other purposes. Overriding #fillInStackTrace can prevent collecting the stack trace, which is a costly operation and here looks unnecessary.

##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +781,26 @@ private void processTailFile(final ProcessContext context, final ProcessSession
 
         final FileChannel fileReader = reader;
         final AtomicLong positionHolder = new AtomicLong(position);
+        Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();

Review comment:
       Minor: just for consistency purposes: this could be final as the ones above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] tpalfy commented on a change in pull request #4655: NIFI-7972 TailFile NFS improvement

Posted by GitBox <gi...@apache.org>.
tpalfy commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522373168



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .build();
 
+    static final PropertyDescriptor REREAD_ON_NUL = new PropertyDescriptor.Builder()
+            .name("reread-on-nul")
+            .displayName("Reread when NUL encountered")
+            .description("If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. "
+                + "The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. "
+                + "NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. "
+                + "CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. "
+                + "For this reason users should refrain from using this feature if they can help it and try to avoid having the target file on a file system where reads are unreliable.")
+            .required(false)
+            .allowableValues("true", "false")
+            .defaultValue("false")

Review comment:
       I don't think it should be. It shouldn't be considered a fundamentally important setting.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org