Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2022/12/17 22:46:20 UTC

[GitHub] [nifi] turcsanyip commented on a diff in pull request #6792: NIFI-10884 Conflict resolution in PutAzureDataLakeStorage should log the target filename

turcsanyip commented on code in PR #6792:
URL: https://github.com/apache/nifi/pull/6792#discussion_r1051492496


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureDataLakeStorage.java:
##########
@@ -212,25 +212,38 @@ static void uploadContent(DataLakeFileClient fileClient, InputStream in, long le
         fileClient.flush(length, true);
     }
 
-    //Visible for testing
+    /**
+     * This method serves as a "commit" for the upload process. To support various Conflict Resolution Strategies the processor uploads
+     * the content of the FlowFile to a temporary file with a unique name, then attempts to rename it. It is not an efficient approach,
+     * especially for large files, but it is needed because of the issue (azure-sdk-for-java/issues/31248) linked above.

Review Comment:
   The temporary file + rename is needed because the "put" in ADLS is not atomic: you create the file first (as a 0-byte file), then append the payload. Without the temporary file, the work-in-progress file would be visible to readers before the upload finishes.
   So it is not strictly related to conflict resolution and has nothing to do with issue 31248 (which is about chunked uploading of large files).
   
   Could you please correct the documentation accordingly?
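
For illustration, the pattern described above can be sketched on a local filesystem with `java.nio.file`. This is only an analogy, not the processor's code and not the Azure SDK API: the class name `TempThenRenameSketch`, the `upload` method, and the `_tmp_` prefix are all hypothetical. It writes the payload under a unique temporary name and only then renames it, so a reader looking at the target name never sees a partially written file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Local-filesystem analogy only; none of these names come from NiFi or the Azure SDK.
final class TempThenRenameSketch {

    /**
     * Writes the content to a uniquely named temporary file, then renames it to the
     * target name. With overwrite == false the rename fails if the target already exists.
     */
    static Path upload(Path directory, String fileName, InputStream content, boolean overwrite)
            throws IOException {
        Path target = directory.resolve(fileName);
        // Hypothetical naming scheme for the work-in-progress file.
        Path temp = directory.resolve("_tmp_" + UUID.randomUUID() + "_" + fileName);
        try {
            // Step 1: create the file and write the payload. Nothing exists under
            // the target name yet, so readers cannot observe a half-written file.
            Files.copy(content, temp);

            // Step 2: the "commit". The finished file appears under its final name.
            if (overwrite) {
                Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
            } else {
                Files.move(temp, target); // throws an IOException if target exists
            }
            return target;
        } finally {
            // After a successful rename this is a no-op; after a failure it
            // removes the leftover temporary file.
            Files.deleteIfExists(temp);
        }
    }
}
```

In this sketch the conflict handling falls out of the rename step, but the temporary file itself exists for the visibility reason, which is the distinction the Javadoc should make.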


