You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2022/04/07 20:08:33 UTC

[GitHub] [nifi] markap14 opened a new pull request, #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

markap14 opened a new pull request, #5944:
URL: https://github.com/apache/nifi/pull/5944

   … best practices and cleaned up code a bit. Fixed several integration tests.
   
   <!--
     Licensed to the Apache Software Foundation (ASF) under one or more
     contributor license agreements.  See the NOTICE file distributed with
     this work for additional information regarding copyright ownership.
     The ASF licenses this file to You under the Apache License, Version 2.0
     (the "License"); you may not use this file except in compliance with
     the License.  You may obtain a copy of the License at
         http://www.apache.org/licenses/LICENSE-2.0
     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
   -->
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
   #### Description of PR
   
   _Enables X functionality; fixes bug NIFI-YYYY._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876025670


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureBlobStorage_v12.java:
##########
@@ -92,10 +97,10 @@ public class PutAzureBlobStorage_v12 extends AbstractAzureBlobProcessor_v12 {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            AzureStorageUtils.CONTAINER,
-            CREATE_CONTAINER,
-            BLOB_NAME
+        AzureStorageUtils.CONTAINER,
+        BLOB_NAME,
+        CREATE_CONTAINER,
+        STORAGE_CREDENTIALS_SERVICE

Review Comment:
   In reviewing this now after a while, I would actually argue that it makes more sense to push the CREATE_CONTAINER lower in the list. When I decide I want to push data to Azure Blob Storage, the first thing I want to configure is the container to push it in and the name of the object. Then I want to configure authentication. Whether or not the container should be auto-created is of secondary importance.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on PR #5944:
URL: https://github.com/apache/nifi/pull/5944#issuecomment-1094044248

   Great catches @jfrazee and @gresockj . You are both indeed right - I did overlook the necessity of including the `path` attribute. Will update the Fetch processors to use `${path}/${filename}` and ensure that the List processors are properly populating the `path` attribute.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] github-actions[bot] closed pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…
URL: https://github.com/apache/nifi/pull/5944


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] turcsanyip commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r880634629


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage_v12.java:
##########
@@ -131,14 +132,14 @@ public class ListAzureBlobStorage_v12 extends AbstractListProcessor<BlobInfo> {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME_PREFIX,
-            RECORD_WRITER,
-            LISTING_STRATEGY,
-            TRACKING_STATE_CACHE,
-            TRACKING_TIME_WINDOW,
-            INITIAL_LISTING_TARGET
+        CONTAINER,
+        LISTING_STRATEGY,
+        RECORD_WRITER,
+        STORAGE_CREDENTIALS_SERVICE,
+        BLOB_NAME_PREFIX,
+        TRACKING_STATE_CACHE,
+        TRACKING_TIME_WINDOW,
+        INITIAL_LISTING_TARGET

Review Comment:
   One more thing to note here: `TRACKING_STATE_CACHE`, `TRACKING_TIME_WINDOW` and `INITIAL_LISTING_TARGET` properties depend on `LISTING_STRATEGY` so I believe the best place for them would be immediately after `LISTING_STRATEGY`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] jfrazee commented on pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
jfrazee commented on PR #5944:
URL: https://github.com/apache/nifi/pull/5944#issuecomment-1093284125

   > Overall, great updates, @markap14! Using the processors felt much more internally consistent now.
   > 
   > I noted that while `ListAzureBlobStorage` does set the `filename` attribute, it doesn't set it to a value that can be directly used by `FetchAzureBlobStorage`. Instead, the `azure.blobname` contains what would be the usable value. For example, `filename` contained `test.txt` and `azure.blobname` contained `resources/test.txt` in my test, and it's the latter that is needed by `FetchAzureBlobStorage`. Additionally, `path` was set to `./`, which I think is not consistent. What do you think about setting `path` to the value before the filename, and defaulting the `FetchAzureBlobStorage` `Blob Name` property to `${path}/${filename}`? Not sure how this would work at the root level, however.
   > 
   > Same issue for the `*_12` processors. I have not exercised the ADLS processors yet.
   
   Oops. Overlapping replies.
   
   `${path}/${filename}` is not inconsistent with some of the other Fetch processors so I think it's a reasonable default. The behavior of `${path}` is also weird enough to call broken so it'd seem ok to change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] turcsanyip commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r875193626


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureBlobStorage.java:
##########
@@ -93,6 +92,21 @@ public class PutAzureBlobStorage extends AbstractAzureBlobProcessor {
                   "will fail if the container does not exist.")
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(

Review Comment:
   Upper case names should be used for `static` `final` fields => `PROPERTIES`



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/DeleteAzureBlobStorage.java:
##########
@@ -64,24 +64,32 @@ public class DeleteAzureBlobStorage extends AbstractAzureBlobProcessor {
             .required(true)
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        AzureStorageUtils.CONTAINER_WITH_DEFAULT_VALUE,
+        BLOB,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        AzureStorageUtils.PROXY_CONFIGURATION_SERVICE,
+        DELETE_SNAPSHOTS_OPTION));

Review Comment:
   I would consider to move this property up, just after BLOB because it is related to the deleted blob.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/utils/AzureStorageUtils.java:
##########
@@ -271,29 +275,32 @@ public static AzureStorageCredentialsDetails createStorageCredentialsDetails(Pro
     public static Collection<ValidationResult> validateCredentialProperties(ValidationContext validationContext) {
         final List<ValidationResult> results = new ArrayList<>();
 
-        final String storageCredentials = validationContext.getProperty(STORAGE_CREDENTIALS_SERVICE).getValue();
-        final String accountName = validationContext.getProperty(ACCOUNT_NAME).getValue();
-        final String accountKey = validationContext.getProperty(ACCOUNT_KEY).getValue();
-        final String sasToken = validationContext.getProperty(PROP_SAS_TOKEN).getValue();
-        final String endpointSuffix = validationContext.getProperty(ENDPOINT_SUFFIX).getValue();
+        final boolean credentialsSet = validationContext.getProperty(STORAGE_CREDENTIALS_SERVICE).isSet();
+        final boolean accountNameSet = StringUtils.isNotBlank(validationContext.getProperty(ACCOUNT_NAME).getValue());
+        final boolean accountKeySet = StringUtils.isNotBlank(validationContext.getProperty(ACCOUNT_KEY).getValue());
+        final boolean sasTokenSet = StringUtils.isNotBlank(validationContext.getProperty(PROP_SAS_TOKEN).getValue());
+        final boolean endpointSuffixSet = StringUtils.isNotBlank(validationContext.getProperty(ENDPOINT_SUFFIX).getValue());
 
-        if (!((StringUtils.isNotBlank(storageCredentials) && StringUtils.isBlank(accountName) && StringUtils.isBlank(accountKey) && StringUtils.isBlank(sasToken))
-                || (StringUtils.isBlank(storageCredentials) && StringUtils.isNotBlank(accountName) && StringUtils.isNotBlank(accountKey) && StringUtils.isBlank(sasToken))
-                || (StringUtils.isBlank(storageCredentials) && StringUtils.isNotBlank(accountName) && StringUtils.isBlank(accountKey) && StringUtils.isNotBlank(sasToken)))) {
+        final boolean onlyCredentialsServiceSet = credentialsSet && !accountNameSet && !accountKeySet && !sasTokenSet;
+        final boolean accountNameAndKeySet = !credentialsSet && !sasTokenSet && accountNameSet && accountKeySet;
+        final boolean accountNameAndSasTokenSet = !credentialsSet && !accountKeySet && accountNameSet && sasTokenSet;

Review Comment:
   Following the naming concept of `onlyCredentialsServiceSet`, these variables could be named `onlyAccountNameAndKeySet` and `onlyAccountNameAndSasTokenSet`.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureBlobStorage.java:
##########
@@ -93,6 +92,21 @@ public class PutAzureBlobStorage extends AbstractAzureBlobProcessor {
                   "will fail if the container does not exist.")
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        AzureStorageUtils.CONTAINER,
+        BLOB_NAME,
+        CREATE_CONTAINER,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        AzureStorageUtils.PROXY_CONFIGURATION_SERVICE,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_TYPE,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_ID,
+        AzureBlobClientSideEncryptionUtils.CSE_SYMMETRIC_KEY_HEX

Review Comment:
   I would consider to move these properties up, just after `CREATE_CONTAINER`. They belong to the "upload process / data", similar to `CREATE_CONTAINER`.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage.java:
##########
@@ -140,6 +144,15 @@ protected Map<String, String> createAttributes(BlobInfo entity, ProcessContext c
         attributes.put("mime.type", entity.getContentType());
         attributes.put("lang", entity.getContentLanguage());
 
+        final String entityName = entity.getBlobName();
+        if (entityName.contains("/")) {
+            attributes.put(CoreAttributes.FILENAME.key(), StringUtils.substringAfterLast(entityName, "/"));
+            attributes.put(CoreAttributes.PATH.key(), StringUtils.substringBeforeLast(entityName, "/"));
+        } else {
+            attributes.put(CoreAttributes.FILENAME.key(), entityName);
+            attributes.put(CoreAttributes.PATH.key(), ".");

Review Comment:
   The blob is in the "root directory" in this case. Shouldn't it be `'/'` instead?



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage.java:
##########
@@ -99,20 +103,20 @@ public class ListAzureBlobStorage extends AbstractListProcessor<BlobInfo> {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            LISTING_STRATEGY,
-            AbstractListProcessor.RECORD_WRITER,
-            AzureStorageUtils.CONTAINER,
-            AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
-            AzureStorageUtils.ACCOUNT_NAME,
-            AzureStorageUtils.ACCOUNT_KEY,
-            AzureStorageUtils.PROP_SAS_TOKEN,
-            AzureStorageUtils.ENDPOINT_SUFFIX,
-            PROP_PREFIX,
-            AzureStorageUtils.PROXY_CONFIGURATION_SERVICE,
-            ListedEntityTracker.TRACKING_STATE_CACHE,
-            ListedEntityTracker.TRACKING_TIME_WINDOW,
-            ListedEntityTracker.INITIAL_LISTING_TARGET
-            ));
+        AzureStorageUtils.CONTAINER,
+        LISTING_STRATEGY,
+        AbstractListProcessor.RECORD_WRITER,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        PROP_PREFIX,
+        AzureStorageUtils.PROXY_CONFIGURATION_SERVICE,
+        ListedEntityTracker.TRACKING_STATE_CACHE,
+        ListedEntityTracker.TRACKING_TIME_WINDOW,
+        ListedEntityTracker.INITIAL_LISTING_TARGET

Review Comment:
   I think property ordering could be improved further:
   - `PROP_PREFIX`: just after `CONTAINER`
   - `ListedEntityTracker.*`: following LISTING_STRATEGY 



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureBlobStorage.java:
##########
@@ -85,6 +79,22 @@ public class FetchAzureBlobStorage extends AbstractAzureBlobProcessor {
             .required(false)
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        AzureStorageUtils.CONTAINER_WITH_DEFAULT_VALUE,
+        BLOB,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        RANGE_START,
+        RANGE_LENGTH,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_TYPE,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_ID,
+        AzureBlobClientSideEncryptionUtils.CSE_SYMMETRIC_KEY_HEX,

Review Comment:
   I would consider to move these properties up, just after `BLOB`. They are related to the blob being fetched.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage_v12.java:
##########
@@ -131,14 +132,14 @@ public class ListAzureBlobStorage_v12 extends AbstractListProcessor<BlobInfo> {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME_PREFIX,
-            RECORD_WRITER,
-            LISTING_STRATEGY,
-            TRACKING_STATE_CACHE,
-            TRACKING_TIME_WINDOW,
-            INITIAL_LISTING_TARGET
+        CONTAINER,
+        LISTING_STRATEGY,
+        RECORD_WRITER,
+        STORAGE_CREDENTIALS_SERVICE,
+        BLOB_NAME_PREFIX,
+        TRACKING_STATE_CACHE,
+        TRACKING_TIME_WINDOW,
+        INITIAL_LISTING_TARGET

Review Comment:
   In my opinion, the original ordering was better.
   The concept was:
   1. credential
   2. what to list (container and prefix)
   3. listing strategy and its related properties
   
   We can change the order of these groups (e.g. move the credential down, though I always set it first) but I would not split [container + prefix] and [listing strategy + related properties].



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureBlobStorage_v12.java:
##########
@@ -109,11 +100,11 @@ public class FetchAzureBlobStorage_v12 extends AbstractAzureBlobProcessor_v12 {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME,
-            RANGE_START,
-            RANGE_LENGTH
+        CONTAINER,
+        BLOB_NAME,
+        STORAGE_CREDENTIALS_SERVICE,
+        RANGE_START,
+        RANGE_LENGTH

Review Comment:
   Similar to List / Put, the concept was:
   - credential first
   - container
   - blob (and related)
   
   The credential property can be moved down but I would not split the blob and range properties because they belong together.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/PutAzureBlobStorage_v12.java:
##########
@@ -92,10 +97,10 @@ public class PutAzureBlobStorage_v12 extends AbstractAzureBlobProcessor_v12 {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            AzureStorageUtils.CONTAINER,
-            CREATE_CONTAINER,
-            BLOB_NAME
+        AzureStorageUtils.CONTAINER,
+        BLOB_NAME,
+        CREATE_CONTAINER,
+        STORAGE_CREDENTIALS_SERVICE

Review Comment:
   Similar to the List processor, the original concept was:
   - credential first
   - container
   - blob
   
   I would rather not split `CONTAINER` + `CREATE_CONTAINER`.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureDataLakeStorage.java:
##########
@@ -123,33 +122,34 @@ public class ListAzureDataLakeStorage extends AbstractListProcessor<ADLSFileInfo
     public static final PropertyDescriptor PATH_FILTER = new PropertyDescriptor.Builder()
             .name("path-filter")
             .displayName("Path Filter")
-            .description(String.format("When '%s' is true, then only subdirectories whose paths match the given regular expression will be scanned", RECURSE_SUBDIRECTORIES.getDisplayName()))
+            .description("Only subdirectories whose paths match the given regular expression will be scanned")
             .required(false)
             .addValidator(StandardValidators.REGULAR_EXPRESSION_WITH_EL_VALIDATOR)
             .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
+            .dependsOn(RECURSE_SUBDIRECTORIES, "true")
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            ADLS_CREDENTIALS_SERVICE,
-            FILESYSTEM,
-            DIRECTORY,
-            RECURSE_SUBDIRECTORIES,
-            FILE_FILTER,
-            PATH_FILTER,
-            RECORD_WRITER,
-            LISTING_STRATEGY,
-            TRACKING_STATE_CACHE,
-            TRACKING_TIME_WINDOW,
-            INITIAL_LISTING_TARGET));
+        FILESYSTEM,
+        DIRECTORY,
+        ADLS_CREDENTIALS_SERVICE,
+        FILE_FILTER,
+        RECURSE_SUBDIRECTORIES,
+        PATH_FILTER,

Review Comment:
   I believe `DIRECTORY` and `RECURSE_SUBDIRECTORIES` belong together and should not be split. Also `FILE_FILTER` and `PATH_FILTER`.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureDataLakeStorage.java:
##########
@@ -78,12 +79,18 @@ public class FetchAzureDataLakeStorage extends AbstractAzureDataLakeStorageProce
             .defaultValue("0")
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        FILESYSTEM_WITH_DEFAULT,
+        DIRECTORY_WITH_DEFAULT,
+        FILE,
+        ADLS_CREDENTIALS_SERVICE,
+        RANGE_START,
+        RANGE_LENGTH,
+        NUM_RETRIES

Review Comment:
   Similar to `FetchAzureBlobStorage_v12`: `FILE` and `RANGE_*`properties should not be split, I believe.



##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage.java:
##########
@@ -63,12 +57,22 @@
 import org.apache.nifi.processors.azure.storage.utils.BlobInfo.Builder;
 import org.apache.nifi.serialization.record.RecordSchema;
 
+import java.io.IOException;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.EnumSet;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
 import java.util.Optional;
 
 @PrimaryNodeOnly
 @TriggerSerially
 @Tags({ "azure", "microsoft", "cloud", "storage", "blob" })
-@SeeAlso({ FetchAzureBlobStorage.class, PutAzureBlobStorage.class, DeleteAzureBlobStorage.class })
+@SeeAlso({ ListAzureBlobStorage_v12.class, FetchAzureBlobStorage.class, PutAzureBlobStorage.class, DeleteAzureBlobStorage.class })

Review Comment:
   It is a good idea to refer the v12 List processor here. Similar references could be added to the Put / Delete / Fetch processors too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876045387


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureBlobStorage.java:
##########
@@ -85,6 +79,22 @@ public class FetchAzureBlobStorage extends AbstractAzureBlobProcessor {
             .required(false)
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        AzureStorageUtils.CONTAINER_WITH_DEFAULT_VALUE,
+        BLOB,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        RANGE_START,
+        RANGE_LENGTH,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_TYPE,
+        AzureBlobClientSideEncryptionUtils.CSE_KEY_ID,
+        AzureBlobClientSideEncryptionUtils.CSE_SYMMETRIC_KEY_HEX,

Review Comment:
   They are related to the blob. But they are far less important. Meaning that they will be configured far less often. The most important (read: the most commonly configured) properties should be first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876021000


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureBlobStorage_v12.java:
##########
@@ -109,11 +100,11 @@ public class FetchAzureBlobStorage_v12 extends AbstractAzureBlobProcessor_v12 {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME,
-            RANGE_START,
-            RANGE_LENGTH
+        CONTAINER,
+        BLOB_NAME,
+        STORAGE_CREDENTIALS_SERVICE,
+        RANGE_START,
+        RANGE_LENGTH

Review Comment:
   @turcsanyip that makes sense from a purely organizational "like things should go together" perspective. When a user goes to configure a processor, though, there are certain things that the user will definitely expect to configure. The first thing that comes to mind is "What data do I want to fetch?" That's configured via the container and blob name. Next is authentication. So that should be the first thing presented to the user. We want to present the most important configuration options first. After that, it makes sense to group things with similar importance by categories. So so RANGE_START should definitely be located with RANGE_LENGTH. But it doesn't really make sense put those before credentials because they are far less important (in terms of what the user will want to configure). And it doesn't make sense to me to configure the Container, then the Credentials, then the Blob, because the Container and Blob together represent "what's the data that I want?"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876029524


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage_v12.java:
##########
@@ -131,14 +132,14 @@ public class ListAzureBlobStorage_v12 extends AbstractListProcessor<BlobInfo> {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME_PREFIX,
-            RECORD_WRITER,
-            LISTING_STRATEGY,
-            TRACKING_STATE_CACHE,
-            TRACKING_TIME_WINDOW,
-            INITIAL_LISTING_TARGET
+        CONTAINER,
+        LISTING_STRATEGY,
+        RECORD_WRITER,
+        STORAGE_CREDENTIALS_SERVICE,
+        BLOB_NAME_PREFIX,
+        TRACKING_STATE_CACHE,
+        TRACKING_TIME_WINDOW,
+        INITIAL_LISTING_TARGET

Review Comment:
   I can agree with keeping the Blob Name Prefix with container. I do think that should come before the credentials. I think I'd tend to put listing strategy before credentials but honestly can go either way. But will move the Blob Name Prefix up to colocate with container, as I regard them as both related to one another and of equal importance.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] turcsanyip commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r877361359


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureBlobStorage_v12.java:
##########
@@ -109,11 +100,11 @@ public class FetchAzureBlobStorage_v12 extends AbstractAzureBlobProcessor_v12 {
             .build();
 
     private static final List<PropertyDescriptor> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
-            STORAGE_CREDENTIALS_SERVICE,
-            CONTAINER,
-            BLOB_NAME,
-            RANGE_START,
-            RANGE_LENGTH
+        CONTAINER,
+        BLOB_NAME,
+        STORAGE_CREDENTIALS_SERVICE,
+        RANGE_START,
+        RANGE_LENGTH

Review Comment:
   @markap14 indeed, moving `STORAGE_CREDENTIALS_SERVICE` down further, I suggested in the comment, was not a good idea.
   Actually, when it was placed at the first position originally, I also followed the "What data do I want to fetch?" approach. In Azure the container name is not globally unique (unlike bucket name on S3) but the container exists within a Storage Account. So, for identifying the data, the user needs: Storage Account > Container > Blob. `STORAGE_CREDENTIALS_SERVICE` contains the `Storage Account Name`, that's why it was placed before `Container` (and also because the credential is a property to be configured always).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] jfrazee commented on pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
jfrazee commented on PR #5944:
URL: https://github.com/apache/nifi/pull/5944#issuecomment-1093089854

   @markap14 I looked these through and they all look like good changes (which you probably don't need to be told). There's one thing I wanted to call out that I think might be relevant with changing the default to `${filename}`. I think it will break a quick creation of List -> Fetch without a user having to edit Fetch. The gist is that `azure.blobname` and `filename` aren't the same. The former includes the full path and the latter is just the basename (or it was that way).
   
   See https://github.com/apache/nifi/pull/3906#issuecomment-623016616


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876048112


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/ListAzureBlobStorage.java:
##########
@@ -140,6 +144,15 @@ protected Map<String, String> createAttributes(BlobInfo entity, ProcessContext c
         attributes.put("mime.type", entity.getContentType());
         attributes.put("lang", entity.getContentLanguage());
 
+        final String entityName = entity.getBlobName();
+        if (entityName.contains("/")) {
+            attributes.put(CoreAttributes.FILENAME.key(), StringUtils.substringAfterLast(entityName, "/"));
+            attributes.put(CoreAttributes.PATH.key(), StringUtils.substringBeforeLast(entityName, "/"));
+        } else {
+            attributes.put(CoreAttributes.FILENAME.key(), entityName);
+            attributes.put(CoreAttributes.PATH.key(), ".");

Review Comment:
   I don't believe so. I think `.` makes the most sense. I can definitely see the argument for why you'd say that `/` makes the most sense. However, consider that we then want to maintain the same directory structure when pushing the data elsewhere. Let's say we have ListAzureBlobStorage -> FetchAzureBlobStorage -> PutFile for instance. We don't want PutFile trying to write the data to the root of the file system. We want it writing to whatever directory it's configured to write to. So a value of `.` is essentially "the root of where you're configured to write to" vs. "the root of the file system."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] markap14 commented on a diff in pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
markap14 commented on code in PR #5944:
URL: https://github.com/apache/nifi/pull/5944#discussion_r876049199


##########
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/DeleteAzureBlobStorage.java:
##########
@@ -64,24 +64,32 @@ public class DeleteAzureBlobStorage extends AbstractAzureBlobProcessor {
             .required(true)
             .build();
 
+    private static final List<PropertyDescriptor> properties = Collections.unmodifiableList(Arrays.asList(
+        AzureStorageUtils.CONTAINER_WITH_DEFAULT_VALUE,
+        BLOB,
+        AzureStorageUtils.STORAGE_CREDENTIALS_SERVICE,
+        AzureStorageUtils.ACCOUNT_NAME,
+        AzureStorageUtils.ACCOUNT_KEY,
+        AzureStorageUtils.PROP_SAS_TOKEN,
+        AzureStorageUtils.ENDPOINT_SUFFIX,
+        AzureStorageUtils.PROXY_CONFIGURATION_SERVICE,
+        DELETE_SNAPSHOTS_OPTION));

Review Comment:
   Agreed that it's related to the blob, but based on importance, it should be low.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi] github-actions[bot] commented on pull request #5944: NIFI-9892: Updated Azure storage related processors to adhere to NiFi…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #5944:
URL: https://github.com/apache/nifi/pull/5944#issuecomment-1254358210

   We're marking this PR as stale due to lack of updates in the past few months. If after another couple of weeks the stale label has not been removed this PR will be closed. This stale marker and eventual auto close does not indicate a judgement of the PR just lack of reviewer bandwidth and helps us keep the PR queue more manageable.  If you would like this PR re-opened you can do so and a committer can remove the stale tag.  Or you can open a new PR.  Try to help review other PRs to increase PR review bandwidth which in turn helps yours.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org