Posted to issues@nifi.apache.org by "Kasonnara (Jira)" <ji...@apache.org> on 2021/03/29 16:59:00 UTC

[jira] [Created] (NIFI-8377) CSVReader: quoting and trimming with value separator inconsistency

Kasonnara created NIFI-8377:
-------------------------------

             Summary: CSVReader: quoting and trimming with value separator inconsistency
                 Key: NIFI-8377
                 URL: https://issues.apache.org/jira/browse/NIFI-8377
             Project: Apache NiFi
          Issue Type: Bug
          Components: Extensions
    Affects Versions: 1.13.2, 1.12.1
            Reporter: Kasonnara
         Attachments: template-test-CSVReader-for-bug-report.xml

There is a small inconsistency between quoting and trimming when the value separator appears in the data and the Apache Commons CSV parser is used.

Example:
{noformat}
case, A, B
quoted value,"aa",
quoted and trimmed value, "aa" ,
quoted value with comma,"a,a",
trimmed but wrongly unquoted value with comma, "a,a" ,{noformat}
{color:#000000}In the first three cases, the value is parsed correctly:{color}
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "a,a", B : null{noformat}
{color:#000000}So quoting a value that contains the value separator, and surrounding a quoted value with spaces to be trimmed, each work well on their own.{color}
 
{color:#000000}However, in the last example, which combines a quoted value separator with outer spaces to trim, quoting fails:{color}
{noformat}
A : "\"a", B : "a\""{noformat}
{color:#000000}I think setting org.apache.commons.csv.CSVFormat.withIgnoreSurroundingSpaces(true) on the format used by the CSV parser would solve the issue, but I don't know the whole picture well enough to tell whether this would have other side effects.{color}
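
A minimal sketch of how the last row could be checked against Commons CSV directly, outside NiFi. It assumes commons-csv is on the classpath. The class name CsvQuoteTrimDemo, the helper parse, and the use of CSVFormat.DEFAULT.withTrim() are illustrative assumptions, not the configuration NiFi's CSVReader actually builds. With the flag off, the row should split inside the quotes as reported above; with it on, the quoted value is expected to come back as a single field if the hypothesis holds.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical stand-alone reproduction; not part of NiFi.
public class CsvQuoteTrimDemo {

    // Last data row of the example: a quoted value containing the separator,
    // with a space before the opening quote and after the closing quote.
    static final String LINE =
            "trimmed but wrongly unquoted value with comma, \"a,a\" ,";

    // Parses LINE and returns all of its fields in order.
    // The format is an assumption for illustration only.
    static List<String> parse(boolean ignoreSurroundingSpaces) throws IOException {
        CSVFormat format = CSVFormat.DEFAULT
                .withTrim()
                .withIgnoreSurroundingSpaces(ignoreSurroundingSpaces);
        List<String> fields = new ArrayList<>();
        try (CSVParser parser = CSVParser.parse(LINE, format)) {
            for (CSVRecord record : parser) {
                for (String field : record) {
                    fields.add(field);
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Flag off: the leading space keeps the quote from being recognised
        // as the start of a quoted field, so the comma inside it splits the value.
        System.out.println(parse(false));
        // Flag on: spaces around the quoted field are skipped before quote detection.
        System.out.println(parse(true));
    }
}
```

This only exercises the parser in isolation; whether the flag interacts badly with other CSVReader properties in NiFi is not covered by the sketch.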



--
This message was sent by Atlassian Jira
(v8.3.4#803005)