Posted to issues@nifi.apache.org by "Bryan Bende (JIRA)" <ji...@apache.org> on 2018/01/26 20:57:00 UTC
[jira] [Created] (NIFI-4822) ValidateRecord does not maintain order of CSV records
Bryan Bende created NIFI-4822:
---------------------------------
Summary: ValidateRecord does not maintain order of CSV records
Key: NIFI-4822
URL: https://issues.apache.org/jira/browse/NIFI-4822
Project: Apache NiFi
Issue Type: Bug
Affects Versions: 1.5.0, 1.4.0
Reporter: Bryan Bende
If you have ValidateRecord configured with a CSV reader and CSV writer and send in some valid data, the flow file is routed to "valid", but the columns are written out in a different order than they were read.
This means that if the next processor is another record-oriented processor using the exact same schema and reader, it will fail to read the flow file because the first column won't be what it expects.
From doing some digging, it appears that in WriteCsvResult there is a method getFieldNames() that does this:
{code:java}
final Set<String> allFields = new LinkedHashSet<>();
allFields.addAll(record.getRawFieldNames());
allFields.addAll(recordSchema.getFieldNames());{code}
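In this case, record.getRawFieldNames() comes from the key set of a HashMap, so it does not preserve the order in which the fields were read. Because the raw field names are added to allFields first, their arbitrary order takes precedence over the schema's order.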
CsvRecordReader line 97:
{code:java}
final Map<String, Object> values = new HashMap<>(recordFields.size() * 2);{code}
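To illustrate the ordering issue outside of NiFi, here is a minimal standalone sketch. It is not the NiFi code: the class FieldOrderSketch and the methods readValues() and mergedFieldNames() are hypothetical names made up for this example. It assumes that one possible fix is to collect the values in a LinkedHashMap, whose key set iterates in insertion order, and then shows that the merge logic quoted above yields the fields in the order they were read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FieldOrderSketch {

    // Hypothetical stand-in for the CSV reader: collects one row's values keyed by column name.
    // A LinkedHashMap is used (an assumption, not the current NiFi code) so that keySet()
    // iterates in the order the columns were inserted, i.e. the order they were read.
    static Map<String, Object> readValues(final List<String> header, final List<String> row) {
        final Map<String, Object> values = new LinkedHashMap<>(header.size() * 2);
        for (int i = 0; i < header.size(); i++) {
            values.put(header.get(i), row.get(i));
        }
        return values;
    }

    // Same merge as the snippet quoted from getFieldNames(): raw field names first,
    // then any schema field names not already present.
    static List<String> mergedFieldNames(final Set<String> rawFieldNames, final List<String> schemaFieldNames) {
        final Set<String> allFields = new LinkedHashSet<>();
        allFields.addAll(rawFieldNames);
        allFields.addAll(schemaFieldNames);
        return new ArrayList<>(allFields);
    }

    public static void main(final String[] args) {
        final List<String> header = Arrays.asList("id", "name", "balance", "notes");
        final List<String> row = Arrays.asList("1", "John", "100.00", "none");

        final Map<String, Object> values = readValues(header, row);

        // With an insertion-ordered map the merged names match the header order.
        System.out.println(mergedFieldNames(values.keySet(), header)); // prints [id, name, balance, notes]
    }
}
```

With a plain HashMap in readValues(), the iteration order of keySet() depends on the hash codes of the column names and is not guaranteed to match the header, which is the behavior described above.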
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)