You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "annanys23 (via GitHub)" <gi...@apache.org> on 2023/10/25 00:41:00 UTC

[PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

annanys23 opened a new pull request, #7929:
URL: https://github.com/apache/nifi/pull/7929

   <!-- Licensed to the Apache Software Foundation (ASF) under one or more -->
   <!-- contributor license agreements.  See the NOTICE file distributed with -->
   <!-- this work for additional information regarding copyright ownership. -->
   <!-- The ASF licenses this file to You under the Apache License, Version 2.0 -->
   <!-- (the "License"); you may not use this file except in compliance with -->
   <!-- the License.  You may obtain a copy of the License at -->
   <!--     http://www.apache.org/licenses/LICENSE-2.0 -->
   <!-- Unless required by applicable law or agreed to in writing, software -->
   <!-- distributed under the License is distributed on an "AS IS" BASIS, -->
   <!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -->
   <!-- See the License for the specific language governing permissions and -->
   <!-- limitations under the License. -->
   
   # Summary
   
   [NIFI-12249](https://issues.apache.org/jira/browse/NIFI-12249)
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created
   
   ### Pull Request Tracking
   
   - [x] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-00000`
   - [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-00000`
   
   ### Pull Request Formatting
   
   - [x] Pull Request based on current revision of the `main` branch
   - [x] Pull Request refers to a feature branch with one commit containing changes
   
   # Verification
   
   Unit tests updated to ensure failure attribute is set on the flow file.  Some tests were updated to check for specific error conditions cited in the failure reasons written to the attribute.
   
   Also, verified by creating the following flow:
   GenerateFlowFile -> success -> FetchFTP
   FetchFTP -> comms.failure, not.found, permission.denied -> LogAttribute
   FetchFTP -> success -> funnel
   LogAttribute -> success -> funnel
   
   Configure FetchFTP with a bogus Hostname, Username, and Remote File
   Ran the flow the saw the attribute 'failure' get logged:
   
   Key: 'failure'
           Value: 'Failed to fetch content for StandardFlowFileRecord[uuid=a363e000-3c24-429a-8c0b-4e2d12a37345,claim=,offset=0,name=a363e000-3c24-429a-8c0b-4e2d12a37345,size=0] from filename file1 on remote host somehost:21 due to org.apache.nifi.processors.standard.socket.ClientConnectException: FTP Client connection failed [somehost:21]; routing to comms.failure'
   
   
   ### Build
   
   - [x] Build completed using `mvn clean install -P contrib-check`
     - [x] JDK 21
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files
   
   ### Documentation
   
   - [x] Documentation formatting appears as expected in rendered files
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on code in PR #7929:
URL: https://github.com/apache/nifi/pull/7929#discussion_r1374767594


##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFileTransfer.java:
##########
@@ -296,6 +298,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             } else {
                 attributes.put(CoreAttributes.FILENAME.key(), filename);
             }
+
+            if (failureRelationship != null) {
+                attributes.put(FAILURE_REASON_ATTRIBUTE, failureRelationship.getName());
+                flowFile = session.putAllAttributes(flowFile, attributes);
+                session.transfer(session.penalize(flowFile), failureRelationship);
+                if (provenanceEventOnFailure) {
+                    session.getProvenanceReporter().route(flowFile, failureRelationship);
+                }
+                cleanupTransfer(transfer, closeConnOnFailure, transferQueue, host, port);
+                return;
+            }

Review Comment:
   Thanks, that makes sense, and sounds good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on code in PR #7929:
URL: https://github.com/apache/nifi/pull/7929#discussion_r1374765978


##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFileTransfer.java:
##########
@@ -254,36 +254,38 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             transfer = transferWrapper.getFileTransfer();
         }
 
+        Relationship failureRelationship = null;
+        boolean closeConnOnFailure = false;
+        boolean provenanceEventOnFailure = false;

Review Comment:
   Reviewing the changes, it looks like this conditional check can be removed. Although the previous failure handling block did not call the Provenance Reporter on communications failure, this should be changed to always call `getProvenanceReporter().route()` for failures.
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "mosermw (via GitHub)" <gi...@apache.org>.
mosermw commented on PR #7929:
URL: https://github.com/apache/nifi/pull/7929#issuecomment-1781272173

   I think the main motivation for requests like this are dataflow managers who don't want a separate UpdateAttribute processor creating a detailed notification message for each of the failure relationships.  "If I have 10 FetchSFTP processors then I need 30 UpdateAttribute processors, when really I only want 1".  At a large scale, little design decisions like this make a difference.
   
   I wonder what the community would think of consolidating the FetchSFTP "comms.failure", "not.found" and "permission.denied" relationships into 1 "failure" relationship and moving forward with an attribute that describes the failure reason?  It would definitely be for NiFi 2.0 since it's a big breaking change. I suppose we would want @markap14 to weigh in on this as the original author.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on code in PR #7929:
URL: https://github.com/apache/nifi/pull/7929#discussion_r1374684426


##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFTP.java:
##########
@@ -47,7 +47,8 @@
     @WritesAttribute(attribute = "ftp.remote.port", description = "The port that was used to communicate with the remote FTP server"),
     @WritesAttribute(attribute = "ftp.remote.filename", description = "The name of the remote file that was pulled"),
     @WritesAttribute(attribute = "filename", description = "The filename is updated to point to the filename fo the remote file"),
-    @WritesAttribute(attribute = "path", description = "If the Remote File contains a directory name, that directory name will be added to the FlowFile using the 'path' attribute")
+    @WritesAttribute(attribute = "path", description = "If the Remote File contains a directory name, that directory name will be added to the FlowFile using the 'path' attribute"),
+    @WritesAttribute(attribute = "failure", description = "If the transfer failed, the failure reason is written to this attribute")

Review Comment:
   This should be changed to reflect the updated attribute name:
   ```suggestion
       @WritesAttribute(attribute = "fetch.failure.reason", description = "The name of the failure relationship applied when routing to any failure relationship")
   ```



##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFileTransfer.java:
##########
@@ -296,6 +298,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             } else {
                 attributes.put(CoreAttributes.FILENAME.key(), filename);
             }
+
+            if (failureRelationship != null) {
+                attributes.put(FAILURE_REASON_ATTRIBUTE, failureRelationship.getName());
+                flowFile = session.putAllAttributes(flowFile, attributes);
+                session.transfer(session.penalize(flowFile), failureRelationship);
+                if (provenanceEventOnFailure) {
+                    session.getProvenanceReporter().route(flowFile, failureRelationship);
+                }
+                cleanupTransfer(transfer, closeConnOnFailure, transferQueue, host, port);
+                return;
+            }

Review Comment:
   Moving this logic to a separate block outside of the catch blocks introduces some potential logic changes, such as the addition of other attributes to the FlowFile, is that intentional? Otherwise it would be less of an impact to apply the attribute value in the appropriate catch blocks.



##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchSFTP.java:
##########
@@ -47,7 +47,8 @@
     @WritesAttribute(attribute = "sftp.remote.port", description = "The port that was used to communicate with the remote SFTP server"),
     @WritesAttribute(attribute = "sftp.remote.filename", description = "The name of the remote file that was pulled"),
     @WritesAttribute(attribute = "filename", description = "The filename is updated to point to the filename fo the remote file"),
-    @WritesAttribute(attribute = "path", description = "If the Remote File contains a directory name, that directory name will be added to the FlowFile using the 'path' attribute")
+    @WritesAttribute(attribute = "path", description = "If the Remote File contains a directory name, that directory name will be added to the FlowFile using the 'path' attribute"),
+    @WritesAttribute(attribute = "failure", description = "If the transfer failed, the failure reason is written to this attribute")

Review Comment:
   This should be changed to reflect the updated attribute name:
   ```suggestion
       @WritesAttribute(attribute = "fetch.failure.reason", description = "The name of the failure relationship applied when routing to any failure relationship")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on code in PR #7929:
URL: https://github.com/apache/nifi/pull/7929#discussion_r1374765978


##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFileTransfer.java:
##########
@@ -254,36 +254,38 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             transfer = transferWrapper.getFileTransfer();
         }
 
+        Relationship failureRelationship = null;
+        boolean closeConnOnFailure = false;
+        boolean provenanceEventOnFailure = false;

Review Comment:
   Reviewing the changes, it looks like this conditional check can be removed. Although the previous failure handling block did not call the Provenance Reporter on communications failure, this should be changed to always call `getProvenanceReporter().route()` for failures.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory closed pull request #7929: NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute
URL: https://github.com/apache/nifi/pull/7929


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "mosermw (via GitHub)" <gi...@apache.org>.
mosermw commented on PR #7929:
URL: https://github.com/apache/nifi/pull/7929#issuecomment-1781646495

   You make some very thoughtful points @exceptionfactory and it makes sense to provide fine grained error handling if the underlying library provides different error codes.  I spot checked some other Fetch processors which only have a single failure relationship, and I imagine their libraries don't enable finer grained handling as easily.
   
   How about one last idea before I join you with opposing this?  What if we do not include the log message in the attribute but only include the relationship name in the attribute?  And change the attribute name to 'fetch.failure.reason' which aligns with the FetchParquet processor.  This should enable this PR to be much simpler and easier to maintain.
   
   fetch.failure.reason=comms.failure or fetch.failure.reason=not.found or fetch.failure.reason=permission.denied


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "annanys23 (via GitHub)" <gi...@apache.org>.
annanys23 commented on PR #7929:
URL: https://github.com/apache/nifi/pull/7929#issuecomment-1781856166

   That works for me.  Thanks for the discussion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on PR #7929:
URL: https://github.com/apache/nifi/pull/7929#issuecomment-1781310538

   I can appreciate the desire to have fewer UpdateAttribute Processors in the case of handling different types of failures. Different Processors take different approaches in terms of failure handling, thinking of InvokeHTTP as one example of multiple categories of relationships.
   
   The value of specific relationships is a clear differentiation of error types. Using FlowFile attributes can be helpful where there are well known status properties, such as HTTP response codes.
   
   SFTP has some error codes, but even communication failures are different than other issues, like file not found or permission denied. One of the main reasons for having different relationships is that some errors could be retried.
   
   From that angle, keeping the separate relationships has value for a number of use cases, even though it might complicate notification uses cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "exceptionfactory (via GitHub)" <gi...@apache.org>.
exceptionfactory commented on PR #7929:
URL: https://github.com/apache/nifi/pull/7929#issuecomment-1781782831

   Thanks for the reply @mosermw.
   
   Introducing a FlowFile attribute that uses the relationship name seems like a good middle way in this scenario. It would support the use case of a single output Processor for failures if desired, providing a clear programmatic way to handle different types of failures.
   
   What do you think @annanys23?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] NIFI-12249 FetchFTP and FetchSFTP processors now write failure reason to 'failure' attribute [nifi]

Posted by "annanys23 (via GitHub)" <gi...@apache.org>.
annanys23 commented on code in PR #7929:
URL: https://github.com/apache/nifi/pull/7929#discussion_r1374757819


##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFileTransfer.java:
##########
@@ -296,6 +298,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             } else {
                 attributes.put(CoreAttributes.FILENAME.key(), filename);
             }
+
+            if (failureRelationship != null) {
+                attributes.put(FAILURE_REASON_ATTRIBUTE, failureRelationship.getName());
+                flowFile = session.putAllAttributes(flowFile, attributes);
+                session.transfer(session.penalize(flowFile), failureRelationship);
+                if (provenanceEventOnFailure) {
+                    session.getProvenanceReporter().route(flowFile, failureRelationship);
+                }
+                cleanupTransfer(transfer, closeConnOnFailure, transferQueue, host, port);
+                return;
+            }

Review Comment:
   Yes, this was intentional.  I would think the other attributes are needed for a follow on processor to have context (ex: generate email notifications with similar content to what is being logged in the catch blocks).  Also, this was meant to remove some redundant code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org