You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2020/07/23 11:11:58 UTC

[GitHub] [nifi] kent-nguyen opened a new pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

kent-nguyen opened a new pull request #4423:
URL: https://github.com/apache/nifi/pull/4423


   Add new property 'Content Disposition' to allow user
   to set the content-disposition http header on the S3 object.
   
   Allowed values are 'inline' (default) and 'attachment'.
   If 'attachment' is selected, the filename will be set to the S3 Object key.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] turcsanyip commented on a change in pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on a change in pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#discussion_r462886495



##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java
##########
@@ -166,13 +166,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             final ObjectMetadata metadata = s3Object.getObjectMetadata();
             if (metadata.getContentDisposition() != null) {
                 final String fullyQualified = metadata.getContentDisposition();

Review comment:
       `fullyQualified` could be called `contentDisposition` now.

##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java
##########
@@ -166,13 +166,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             final ObjectMetadata metadata = s3Object.getObjectMetadata();
             if (metadata.getContentDisposition() != null) {
                 final String fullyQualified = metadata.getContentDisposition();
-                final int lastSlash = fullyQualified.lastIndexOf("/");
-                if (lastSlash > -1 && lastSlash < fullyQualified.length() - 1) {
-                    attributes.put(CoreAttributes.PATH.key(), fullyQualified.substring(0, lastSlash));
-                    attributes.put(CoreAttributes.ABSOLUTE_PATH.key(), fullyQualified);
-                    attributes.put(CoreAttributes.FILENAME.key(), fullyQualified.substring(lastSlash + 1));
+
+                if (fullyQualified.equals(PutS3Object.CONTENT_DISPOSITION_INLINE) || fullyQualified.equals(PutS3Object.CONTENT_DISPOSITION_ATTACHMENT)) {

Review comment:
       In case of `attachment`, the condition will never be true. It should be `startsWith("attachment; filename=")` or similar.
   
   In case of `inline`, it is not possible to distinguish between `inline` content disposition and a file called `inline` but in my opinion we can ignore it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] turcsanyip commented on a change in pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on a change in pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#discussion_r464122120



##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/test/java/org/apache/nifi/processors/aws/s3/TestPutS3Object.java
##########
@@ -171,12 +172,11 @@ public void testFilenameWithNationalCharacters() throws UnsupportedEncodingExcep
 
         runner.run(1);
 
-        ArgumentCaptor<PutObjectRequest> captureRequest = ArgumentCaptor.forClass(PutObjectRequest.class);
-        Mockito.verify(mockS3Client, Mockito.times(1)).putObject(captureRequest.capture());
-        PutObjectRequest request = captureRequest.getValue();
+        runner.assertAllFlowFilesTransferred(PutS3Object.REL_SUCCESS, 1);
+        List<MockFlowFile> flowFiles = runner.getFlowFilesForRelationship(PutS3Object.REL_SUCCESS);
+        MockFlowFile ff = flowFiles.get(0);
 
-        ObjectMetadata objectMetadata = request.getMetadata();
-        assertEquals("Iñtërnâtiônàližætiøn.txt", URLDecoder.decode(objectMetadata.getContentDisposition(), "UTF-8"));
+        assertEquals("Iñtërnâtiônàližætiøn.txt", URLDecoder.decode(ff.getAttribute(CoreAttributes.FILENAME.key()), "UTF-8"));

Review comment:
       This test was written to test if the Content-Disposition header is being passed to AWS in URL encoded form.
   So to check if the Content-Disposition is `I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0li%C5%BE%C3%A6ti%C3%B8n.txt` for the test data.
   
   The `filename` FlowFile attribute is not URL encoded, it simply contains `Iñtërnâtiônàližætiøn.txt`. URLDecoder does not do anything with it and the assert will always be true.
   
   Please revert back to the original test if there is no other reason to modify that.
   
   One change in the assert seems to me reasonable though:
   `assertEquals(URLEncoder.encode("Iñtërnâtiônàližætiøn.txt", "UTF-8"), objectMetadata.getContentDisposition());`
   (the original assert would be true even if the Content-Disposition was not URL encoded)

##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java
##########
@@ -165,14 +165,19 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
 
             final ObjectMetadata metadata = s3Object.getObjectMetadata();
             if (metadata.getContentDisposition() != null) {
-                final String fullyQualified = metadata.getContentDisposition();
-                final int lastSlash = fullyQualified.lastIndexOf("/");
-                if (lastSlash > -1 && lastSlash < fullyQualified.length() - 1) {
-                    attributes.put(CoreAttributes.PATH.key(), fullyQualified.substring(0, lastSlash));
-                    attributes.put(CoreAttributes.ABSOLUTE_PATH.key(), fullyQualified);
-                    attributes.put(CoreAttributes.FILENAME.key(), fullyQualified.substring(lastSlash + 1));
+                final String contentDisposition = metadata.getContentDisposition();
+
+                if (contentDisposition.equals(PutS3Object.CONTENT_DISPOSITION_INLINE) || contentDisposition.startsWith("attachment; filename=")) {
+                    attributes.put(CoreAttributes.FILENAME.key(), key);

Review comment:
       Setting the`filename` to the full object key (eg. `dir1/dir2/file1`) is a bit inconsistent because `filename` typically does not contain the "path" (like in the else branch below and in case of other processor too).
   I would suggest to use the simple file name:
   - `dir1/dir2/file1` => `file1`
   - `file2` => `file2`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] kent-nguyen commented on a change in pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
kent-nguyen commented on a change in pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#discussion_r464188817



##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java
##########
@@ -165,14 +165,19 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
 
             final ObjectMetadata metadata = s3Object.getObjectMetadata();
             if (metadata.getContentDisposition() != null) {
-                final String fullyQualified = metadata.getContentDisposition();
-                final int lastSlash = fullyQualified.lastIndexOf("/");
-                if (lastSlash > -1 && lastSlash < fullyQualified.length() - 1) {
-                    attributes.put(CoreAttributes.PATH.key(), fullyQualified.substring(0, lastSlash));
-                    attributes.put(CoreAttributes.ABSOLUTE_PATH.key(), fullyQualified);
-                    attributes.put(CoreAttributes.FILENAME.key(), fullyQualified.substring(lastSlash + 1));
+                final String contentDisposition = metadata.getContentDisposition();
+
+                if (contentDisposition.equals(PutS3Object.CONTENT_DISPOSITION_INLINE) || contentDisposition.startsWith("attachment; filename=")) {
+                    attributes.put(CoreAttributes.FILENAME.key(), key);

Review comment:
       @turcsanyip Thanks for your explanation, it is clear and helps me a lot. I have updated the unit test and attribute setting code in FetchS3Object processor.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] turcsanyip commented on a change in pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on a change in pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#discussion_r460399383



##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/PutS3Object.java
##########
@@ -451,7 +466,6 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
                 public void process(final InputStream rawIn) throws IOException {
                     try (final InputStream in = new BufferedInputStream(rawIn)) {
                         final ObjectMetadata objectMetadata = new ObjectMetadata();
-                        objectMetadata.setContentDisposition(URLEncoder.encode(ff.getAttribute(CoreAttributes.FILENAME.key()), "UTF-8"));

Review comment:
       `Content-Disposition` was originally set to the filename.
   The current change is not backward compatible with it because the default is `inline` and the "no value" case also falls back to `inline`.
   I would suggest to use no default value and to keep the current logic (setting the filename) when the property is not specified. This way it would not be a breaking change and could be added in a minor release (which may have backward compatible changes only).
   Furthermore, as far as I understand, `inline` / `attachment` are only relevant in case of web hosting mode. So having a third option for non web hosting mode seems to me reasonable too.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] kent-nguyen commented on a change in pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
kent-nguyen commented on a change in pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#discussion_r463908965



##########
File path: nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java
##########
@@ -166,13 +166,18 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             final ObjectMetadata metadata = s3Object.getObjectMetadata();
             if (metadata.getContentDisposition() != null) {
                 final String fullyQualified = metadata.getContentDisposition();
-                final int lastSlash = fullyQualified.lastIndexOf("/");
-                if (lastSlash > -1 && lastSlash < fullyQualified.length() - 1) {
-                    attributes.put(CoreAttributes.PATH.key(), fullyQualified.substring(0, lastSlash));
-                    attributes.put(CoreAttributes.ABSOLUTE_PATH.key(), fullyQualified);
-                    attributes.put(CoreAttributes.FILENAME.key(), fullyQualified.substring(lastSlash + 1));
+
+                if (fullyQualified.equals(PutS3Object.CONTENT_DISPOSITION_INLINE) || fullyQualified.equals(PutS3Object.CONTENT_DISPOSITION_ATTACHMENT)) {

Review comment:
       @turcsanyip I have rebased my branch on main. I have also updated the PR to update the variable name and correct the condition as requested. Please let me know if there's anything else I can do. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] asfgit closed pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4423:
URL: https://github.com/apache/nifi/pull/4423


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] turcsanyip commented on pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#issuecomment-666225798


   @kent-nguyen Please rebase your branch onto the current main branch instead of merging the branches. I cannot squash your commits otherwise (getting merge conflict when squashing).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] turcsanyip commented on pull request #4423: NIFI-7664 - Add Content Disposition property to PutS3Object processor

Posted by GitBox <gi...@apache.org>.
turcsanyip commented on pull request #4423:
URL: https://github.com/apache/nifi/pull/4423#issuecomment-666225798


   @kent-nguyen Please rebase your branch onto the current main branch instead of merging the branches. I cannot squash your commits otherwise (getting merge conflict when squashing).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org