Posted to issues@nifi.apache.org by "Bryan Bende (JIRA)" <ji...@apache.org> on 2018/01/26 20:57:00 UTC

[jira] [Created] (NIFI-4822) ValidateRecord does not maintain order of CSV records

Bryan Bende created NIFI-4822:
---------------------------------

             Summary: ValidateRecord does not maintain order of CSV records
                 Key: NIFI-4822
                 URL: https://issues.apache.org/jira/browse/NIFI-4822
             Project: Apache NiFi
          Issue Type: Bug
    Affects Versions: 1.5.0, 1.4.0
            Reporter: Bryan Bende


If you have ValidateRecord configured with a CSV reader and CSV writer and send in some valid data, the flow file is routed to "valid", but the columns are written out in a different order than they were read.

This means that if the next processor is another record-oriented processor using the exact same schema and reader, it will fail to read the flow file because the first column won't be what it expects.

From doing some digging, it appears that in WriteCsvResult there is a method getFieldNames() that does this:
{code:java}
final Set<String> allFields = new LinkedHashSet<>();
allFields.addAll(record.getRawFieldNames());
allFields.addAll(recordSchema.getFieldNames());{code}
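In this case, record.getRawFieldNames() returns the key set of a HashMap, so it does not preserve the order in which the fields were read. Because the LinkedHashSet keeps insertion order and the raw field names are added first, that arbitrary order determines the order of the written columns.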
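To illustrate the effect of the merge, here is a minimal standalone sketch. It does not use NiFi classes: mergeFieldNames() is a hypothetical stand-in for the logic quoted above, and the scrambled list only simulates the arbitrary order a HashMap key set might produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FieldOrderSketch {

    // Hypothetical stand-in for the merge in getFieldNames(): raw names first, then schema names.
    static List<String> mergeFieldNames(Collection<String> rawFieldNames, List<String> schemaFieldNames) {
        final Set<String> allFields = new LinkedHashSet<>();
        allFields.addAll(rawFieldNames);    // insertion order is fixed by whatever order arrives here
        allFields.addAll(schemaFieldNames); // duplicates are ignored, so schema order no longer matters
        return new ArrayList<>(allFields);
    }

    public static void main(String[] args) {
        List<String> schemaOrder = Arrays.asList("id", "name", "age");
        // Simulates an arbitrary key-set order; a real HashMap may produce a different one.
        List<String> scrambledRaw = Arrays.asList("name", "age", "id");

        System.out.println(mergeFieldNames(scrambledRaw, schemaOrder)); // prints [name, age, id]
    }
}
```

The schema order is lost because every schema field is already present in the set by the time the second addAll() runs.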

CsvRecordReader line 97:
{code:java}
final Map<String, Object> values = new HashMap<>(recordFields.size() * 2);{code}
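One possible direction, offered here only as an assumption and not as the fix adopted by the project, would be to back the record values with an insertion-ordered map. The sketch below is standalone: keysAfterReading() and the column names are hypothetical and only contrast the iteration-order guarantees of HashMap and LinkedHashMap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecordValuesSketch {

    // Hypothetical helper: fills the given map column by column, then returns its key order.
    static List<String> keysAfterReading(Map<String, Object> values, List<String> columns) {
        for (String column : columns) {
            values.put(column, "value-of-" + column);
        }
        return new ArrayList<>(values.keySet());
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("zip", "name", "id", "city");

        // HashMap makes no guarantee about iteration order, so this may differ from the read order.
        System.out.println("HashMap:       " + keysAfterReading(new HashMap<>(columns.size() * 2), columns));

        // LinkedHashMap iterates in insertion order, so the read order is preserved.
        System.out.println("LinkedHashMap: " + keysAfterReading(new LinkedHashMap<>(columns.size() * 2), columns));
    }
}
```

With an insertion-ordered map, the key set handed to the writer would match the order in which the columns were read.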



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)