Posted to issues@nifi.apache.org by "Kasonnara (Jira)" <ji...@apache.org> on 2021/03/29 16:59:00 UTC
[jira] [Created] (NIFI-8377) CSVReader: quoting and trimming with value separator inconsistency
Kasonnara created NIFI-8377:
-------------------------------
Summary: CSVReader: quoting and trimming with value separator inconsistency
Key: NIFI-8377
URL: https://issues.apache.org/jira/browse/NIFI-8377
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.13.2, 1.12.1
Reporter: Kasonnara
Attachments: template-test-CSVReader-for-bug-report.xml
There is a small inconsistency between quoting and trimming when the value separator is present in the data and the Apache Commons CSV parser is used.
Example:
{noformat}
case, A, B
quoted value,"aa",
quoted and trimmed value, "aa" ,
quoted value with comma,"a,a",
trimmed but wrongly unquoted value with comma, "a,a" ,{noformat}
In the first three cases, the value is parsed correctly:
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "a,a", B : null{noformat}
So quoting a value that contains the value separator works, and trimming surrounding spaces works, as long as each is used on its own.
However, in the last case, which combines a quoted value separator with outer spaces to trim, quoting fails:
{noformat}
A : "\"a", B : "a\""{noformat}
I think setting org.apache.commons.csv.CSVFormat.withIgnoreSurroundingSpaces(true) on the CSV parser would solve the issue, but I don't have the whole picture, so I can't tell whether this would have other side effects.
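For illustration only, the same kind of behavior can be reproduced outside NiFi with Python's standard csv module. This is an analogy, not the NiFi or Commons CSV code: I am assuming that skipinitialspace plays a role similar to ignoring surrounding spaces, and the parse helper and the sample line are made up for this sketch. Without it, the space in front of the opening quote keeps the quote from being recognized, so the comma splits the value.

```python
import csv
import io

# Hypothetical sample line modeled on the last case of the report.
LINE = 'trimmed value with comma, "a,a" ,\n'


def parse(line, skip_leading_space):
    """Parse one CSV line, then trim every field (trimming happens after parsing)."""
    reader = csv.reader(io.StringIO(line), skipinitialspace=skip_leading_space)
    return [field.strip() for field in next(reader)]


# The leading space hides the opening quote, so the quoted comma splits the value.
naive = parse(LINE, skip_leading_space=False)

# Leading spaces are skipped before quote detection, so the quoting is honored.
fixed = parse(LINE, skip_leading_space=True)

print(naive)
print(fixed)
```

In this sketch the first call splits the value into two fields that still carry their quote characters, while the second call keeps a,a together as one field. Whether withIgnoreSurroundingSpaces(true) behaves the same way inside CSVReader would still need to be checked against NiFi itself.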
--
This message was sent by Atlassian Jira
(v8.3.4#803005)