You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2021/07/26 07:12:14 UTC

[GitHub] [nifi] timeabarna opened a new pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

timeabarna opened a new pull request #5249:
URL: https://github.com/apache/nifi/pull/5249


   …r controller service
   https://issues.apache.org/jira/browse/NIFI-8761
   
   #### Description of PR
   
   Enables to use CSVReader and CSVRecordSetWriter without escape character
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] pgyori commented on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
pgyori commented on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900092826


   Amending my previous comment:
   `"abc\def"` turned into `"abc\\def"` because I was using JSONRecordSetWriter, and in JSON it is normal to escape the backslash. When using CSVRecordSetWriter with CSVReader, the "issue" does not occur, so it is fine, no bug to fix there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] adenes merged pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
adenes merged pull request #5249:
URL: https://github.com/apache/nifi/pull/5249


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] pvillard31 commented on a change in pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on a change in pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#discussion_r681593124



##########
File path: nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-standard-record-utils/src/main/java/org/apache/nifi/csv/CSVUtils.java
##########
@@ -274,7 +275,12 @@ private static CSVFormat buildCustomFormat(final PropertyContext context, final
         final Character quoteChar = getCharUnescaped(context, QUOTE_CHAR, variables);
         format = format.withQuote(quoteChar);
 
-        final Character escapeChar = getCharUnescaped(context, ESCAPE_CHAR, variables);
+        final Character escapeChar;
+        if (context.getProperty(CSVUtils.ESCAPE_CHAR).evaluateAttributeExpressions(variables).getValue().isEmpty()) {
+            escapeChar = null;
+        } else {
+            escapeChar = getCharUnescaped(context, ESCAPE_CHAR, variables);
+        }

Review comment:
       Would the below be easier to read?
   
   ```suggestion
           final Character escapeChar = context.getProperty(CSVUtils.ESCAPE_CHAR).evaluateAttributeExpressions(variables).getValue().isEmpty() ? null : getCharUnescaped(context, ESCAPE_CHAR, variables);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] pgyori commented on a change in pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
pgyori commented on a change in pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#discussion_r689668781



##########
File path: nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-standard-record-utils/src/main/java/org/apache/nifi/csv/CSVValidators.java
##########
@@ -25,15 +25,18 @@
 import java.util.Set;
 
 public class CSVValidators {
+    private static final Set<String> illegalChars = new HashSet<>();
 
-    public static class SingleCharacterValidator implements Validator {
-        private static final Set<String> illegalChars = new HashSet<>();
+    static {
+        illegalChars.add("\r");
+        illegalChars.add("\n");
+    }
 
-        static {
-            illegalChars.add("\r");
-            illegalChars.add("\n");
-        }
+    public static final Validator SINGLE_CHAR_VALIDATOR = createSingleCharValidator(false);
+
+    public static final Validator EMPTY_AND_SINGLE_CHAR_VALIDATOR = createSingleCharValidator(true);

Review comment:
       I recommend naming it EMPTY_OR_SINGLE_CHAR_VALIDATOR as it contains a logical OR in the implementation. When reading the code of CSVUtils, at `.addValidator(CSVValidators.EMPTY_AND_SINGLE_CHAR_VALIDATOR)` I got a little confused. My first impression was that the property is required to be empty and contain a single character at the same time.

##########
File path: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/csv/TestCSVValidators.java
##########
@@ -33,48 +33,109 @@
     /*** SingleCharValidator **/
     @Test
     public void testSingleCharNullValue() {
-        CSVValidators.SingleCharacterValidator validator = new CSVValidators.SingleCharacterValidator();
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
         ValidationContext mockContext = Mockito.mock(ValidationContext.class);
         ValidationResult result = validator.validate("EscapeChar", null, mockContext);
         assertEquals("Input is null for this property", result.getExplanation());
         assertFalse(result.isValid());
     }
 
+    @Test
+    public void testSingleCharEmptyValue() {
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
+        ValidationContext mockContext = Mockito.mock(ValidationContext.class);
+        ValidationResult result = validator.validate("EscapeChar", "", mockContext);
+        assertEquals("Value must be exactly 1 character but was 0 in length", result.getExplanation());
+        assertFalse(result.isValid());
+    }
+
     @Test
     public void testSingleCharTab() {
-        CSVValidators.SingleCharacterValidator validator = new CSVValidators.SingleCharacterValidator();
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
         ValidationContext mockContext = Mockito.mock(ValidationContext.class);
         ValidationResult result = validator.validate("EscapeChar", "\\t", mockContext);
         assertTrue(result.isValid());
     }
 
     @Test
     public void testSingleCharIllegalChar() {
-        CSVValidators.SingleCharacterValidator validator = new CSVValidators.SingleCharacterValidator();
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
         ValidationContext mockContext = Mockito.mock(ValidationContext.class);
         ValidationResult result = validator.validate("EscapeChar", "\\r", mockContext);
         assertEquals("\\r is not a valid character for this property", result.getExplanation());
         assertFalse(result.isValid());
     }
 
+    @Test
+    public void testSingleCharExpressionLanguage() {
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
+        ValidationContext mockContext = Mockito.mock(ValidationContext.class);
+        Mockito.when(mockContext.isExpressionLanguageSupported(Mockito.any())).thenReturn(true);
+        Mockito.when(mockContext.isExpressionLanguagePresent(Mockito.any())).thenReturn(true);
+        ValidationResult result = validator.validate("EscapeChar", "${csv.escape}", mockContext);
+        assertTrue(result.isValid());
+    }
+
     @Test
     public void testSingleCharGoodChar() {
-        CSVValidators.SingleCharacterValidator validator = new CSVValidators.SingleCharacterValidator();
+        Validator validator = CSVValidators.SINGLE_CHAR_VALIDATOR;
         ValidationContext mockContext = Mockito.mock(ValidationContext.class);
         ValidationResult result = validator.validate("EscapeChar", "'", mockContext);
         assertTrue(result.isValid());
     }
 
+    /*** Empty And SingleCharValidator **/

Review comment:
       Being consistent with my other comment, I would replace "And" with "Or".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna edited a comment on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna edited a comment on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900005124


   @pgyori regarding I found some behavior that is not caused by this modification, but came up for me while testing this. If the input data is:
   "abc\def"
   and the escape character is not specified (null), I found that in the output we get:
   "abc\\def"
   
   Is the behaviour is the same if you select null escape character for the Writer as well?
   I think it makes sense to add \ before an \ if you select \ as an escape character for the Writer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna commented on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna commented on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-892498509


   Thanks @pvillard31 for your help, applied your recommendation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna removed a comment on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna removed a comment on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900005124


   @pgyori regarding I found some behavior that is not caused by this modification, but came up for me while testing this. If the input data is:
   "abc\def"
   and the escape character is not specified (null), I found that in the output we get:
   "abc\\def"
   
   Is the behaviour is the same if you select null escape character for the Writer as well?
   I think it makes sense to add \ before an \ if you select \ as an escape character for the Writer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna commented on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna commented on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900097537


   Thanks @pgyori for your help, I've renamed the validator based on your recommendation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna edited a comment on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna edited a comment on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900005124


   @pgyori regarding I found some behavior that is not caused by this modification, but came up for me while testing this. If the input data is:
   "abc\def"
   and the escape character is not specified (null), I found that in the output we get:
   "abc\\def"
   
   Is the behaviour is the same if you select null escape character for the Writer as well?
   I think it makes sense to add '\' before an '\' if you select '\' as an escape character for the Writer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] adenes commented on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
adenes commented on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-904602336


   Thanks @timeabarna for the contribution and @pgyori for the review. Merging to main.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] timeabarna commented on pull request #5249: NIFI-8761 Enable not setting a value for Escape Character in CSVReade…

Posted by GitBox <gi...@apache.org>.
timeabarna commented on pull request #5249:
URL: https://github.com/apache/nifi/pull/5249#issuecomment-900005124


   @pgyori Iregarding I found some behavior that is not caused by this modification, but came up for me while testing this. If the input data is:
   "abc\def"
   and the escape character is not specified (null), I found that in the output we get:
   "abc\\def"
   
   Is the behaviour is the same if you select null escape character for the Writer as well?
   I think it makes sense to add '\' before an '\' if you select '\' as an escape character for the Writer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org