You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2022/08/09 16:10:46 UTC

[GitHub] [nifi] exceptionfactory commented on a diff in pull request #6159: NIFI-8248 Modify PutAzureDataLakeStorage processor to use temp file i…

exceptionfactory commented on code in PR #6159:
URL: https://github.com/apache/nifi/pull/6159#discussion_r941502092


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureDataLakeStorage.java:
##########
@@ -20,7 +20,9 @@
 import com.azure.storage.file.datalake.DataLakeFileClient;
 import com.azure.storage.file.datalake.DataLakeFileSystemClient;
 import com.azure.storage.file.datalake.DataLakeServiceClient;
+import com.azure.storage.file.datalake.models.DataLakeRequestConditions;
 import com.azure.storage.file.datalake.models.DataLakeStorageException;
+import com.google.common.annotations.VisibleForTesting;

Review Comment:
   Use of Google Guava annotations should be avoided.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureDataLakeStorage.java:
##########
@@ -190,4 +218,34 @@ static void uploadContent(DataLakeFileClient fileClient, InputStream in, long le
 
         fileClient.flush(length);
     }
+
+    @VisibleForTesting
+    DataLakeFileClient renameFile(final String fileName, final String directoryPath, final DataLakeFileClient fileClient, final boolean overwrite) {
+        try {
+            final DataLakeRequestConditions destinationCondition = new DataLakeRequestConditions();
+            if (!overwrite) {
+                destinationCondition.setIfNoneMatch("*");
+            }
+            final String destinationPath = createPath(directoryPath, fileName);
+            return fileClient.renameWithResponse(null, destinationPath, null, destinationCondition, null, null).getValue();
+        } catch (DataLakeStorageException dataLakeStorageException) {
+            getLogger().error("Error while renaming temp file " + fileClient.getFileName() + " on Azure Data Lake Storage", dataLakeStorageException);
+            removeTempFile(fileClient);
+            throw dataLakeStorageException;
+        }
+    }
+
+    private String createPath(final String baseDirectory, final String path) {
+        return StringUtils.isNotBlank(baseDirectory)
+                ? baseDirectory + "/" + path
+                : path;
+    }
+
+    private void removeTempFile(final DataLakeFileClient fileClient) {
+        try {
+            fileClient.delete();
+        } catch (Exception e) {
+            getLogger().error("Error while removing temp file " + fileClient.getFileName() + " on Azure Data Lake Storage", e);

Review Comment:
   Log messages should use placeholders instead of string concatenation.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureDataLakeStorage.java:
##########
@@ -129,13 +129,24 @@ public class ListAzureDataLakeStorage extends AbstractListAzureProcessor<ADLSFil
             .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
             .build();
 
+    public static final PropertyDescriptor SHOW_TEMPORARY_FILES = new PropertyDescriptor.Builder()
+            .name("show-temporary-files")
+            .displayName("Show temporary files")
+            .description("Whether temporary files, created during nifi upload process should be shown." +
+                    "These files are incomplete and removed after a completed upload process.")

Review Comment:
   Recommend the following adjustments:
   ```suggestion
               .description("Whether to include temporary files when listing the contents of configured directory paths.")
   ```



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureDataLakeStorage.java:
##########
@@ -83,11 +89,23 @@ public class PutAzureDataLakeStorage extends AbstractAzureDataLakeStorageProcess
             .allowableValues(FAIL_RESOLUTION, REPLACE_RESOLUTION, IGNORE_RESOLUTION)
             .build();
 
+    public static final PropertyDescriptor TEMP_FILE_DIRECTORY_PATH = new PropertyDescriptor.Builder()
+            .name("temp-file-directory-path")
+            .displayName("Temp File Directory Path")

Review Comment:
   The combination of `File`, `Directory`, and `Path` is a bit hard to understand. It seems like this should be named something like `Temporary Directory`, or perhaps `Base Temporary Path`. If I understand the implementation correctly, this path is the prefix for the static hard-coded `TEMP_FILE_DIRECTORY`, is that correct? It would be helpful to clarify the wording.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureDataLakeStorage.java:
##########
@@ -129,13 +129,24 @@ public class ListAzureDataLakeStorage extends AbstractListAzureProcessor<ADLSFil
             .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
             .build();
 
+    public static final PropertyDescriptor SHOW_TEMPORARY_FILES = new PropertyDescriptor.Builder()
+            .name("show-temporary-files")
+            .displayName("Show temporary files")

Review Comment:
   The word `Show` does not seem to fit well. What about `Include Temporary Files`?



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/test/java/org/apache/nifi/processors/azure/storage/ITListAzureDataLakeStorage.java:
##########
@@ -57,6 +58,10 @@ public void setUp() {
         uploadFile(testFile1);
         testFiles.put(testFile1.getFilePath(), testFile1);
 
+        TestFile testTempFile1 = new TestFile("_$azuretempdirectory$", "1234file1");

Review Comment:
   All references to `$azuretempdirectory$` should be replaced with String.format() and use of the shared static value for easier maintenance.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureDataLakeStorage.java:
##########
@@ -190,4 +218,34 @@ static void uploadContent(DataLakeFileClient fileClient, InputStream in, long le
 
         fileClient.flush(length);
     }
+
+    @VisibleForTesting
+    DataLakeFileClient renameFile(final String fileName, final String directoryPath, final DataLakeFileClient fileClient, final boolean overwrite) {
+        try {
+            final DataLakeRequestConditions destinationCondition = new DataLakeRequestConditions();
+            if (!overwrite) {
+                destinationCondition.setIfNoneMatch("*");
+            }
+            final String destinationPath = createPath(directoryPath, fileName);
+            return fileClient.renameWithResponse(null, destinationPath, null, destinationCondition, null, null).getValue();
+        } catch (DataLakeStorageException dataLakeStorageException) {
+            getLogger().error("Error while renaming temp file " + fileClient.getFileName() + " on Azure Data Lake Storage", dataLakeStorageException);

Review Comment:
   It seems unnecessary to indicate `Azure Data Lake Storage` given the name of the Processor. Placeholders should be used instead of string concatenation.
   ```suggestion
               getLogger().error("Renaming File [{}] failed", fileClient.getFileName(), dataLakeStorageException);
   ```



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureDataLakeStorage.java:
##########
@@ -261,10 +272,12 @@ private List<ADLSFileInfo> performListing(final ProcessContext context, final Lo
             options.setRecursive(recurseSubdirectories);
 
             final Pattern baseDirectoryPattern = Pattern.compile("^" + baseDirectory + "/?");
+            final boolean includeTempFiles = context.getProperty(SHOW_TEMPORARY_FILES).asBoolean();
             final long minimumTimestamp = minTimestamp == null ? 0 : minTimestamp;
 
             final List<ADLSFileInfo> listing = fileSystemClient.listPaths(options, null).stream()
                     .filter(pathItem -> !pathItem.isDirectory())
+                    .filter(pathItem -> includeTempFiles || !pathItem.getName().contains("_$azuretempdirectory$"))

Review Comment:
   The `contains()` call should reference the TEMP_FILE_DIRECTORY variable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org