Posted to issues@nifi.apache.org by "Ed Berezitsky (JIRA)" <ji...@apache.org> on 2018/12/06 12:52:00 UTC

[jira] [Commented] (NIFI-5874) CSVReader and CSVRecordSetWriter inject transformed backslash sequences from input

    [ https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711403#comment-16711403 ] 

Ed Berezitsky commented on NIFI-5874:
-------------------------------------

Attached template to fully reproduce this issue.

Even if the initial CSV fully complies with CSV conventions and escapes the backslash with a double backslash (\\t), the first UpdateRecord writes its output with a single backslash (\t), and the next UpdateRecord then converts that sequence into an actual tab character.
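
To make the two-pass behaviour concrete, here is a minimal sketch in plain Java. It is not NiFi code: `readField` and `writeField` are hypothetical stand-ins for what the CSV reader and writer appear to do (the reader turns backslash sequences into the characters they name, the writer emits the value unchanged), and the set of sequences handled is an assumption made only for illustration.

```java
public class TwoPassSketch {

    // Hypothetical stand-in for the reader: a backslash followed by t, n or
    // another backslash is replaced by the character that sequence names.
    public static String readField(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 't': out.append('\t'); break;
                    case 'n': out.append('\n'); break;
                    case '\\': out.append('\\'); break;
                    default: out.append('\\').append(next); // unknown sequence: keep as is
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical stand-in for the writer: no re-escaping on output.
    public static String writeField(String value) {
        return value;
    }

    public static void main(String[] args) {
        // Java literal for the five characters: = \ \ t =
        String input = "=\\\\t=";

        // First UpdateRecord: the double backslash collapses to a single one.
        String afterFirst = writeField(readField(input));

        // Second UpdateRecord: the remaining backslash-t becomes a real tab.
        String afterSecond = writeField(readField(afterFirst));

        System.out.println("after first pass:  " + afterFirst.length() + " chars");
        System.out.println("after second pass: " + afterSecond.length() + " chars, contains tab: "
                + afterSecond.contains("\t"));
    }
}
```

Each pass strips one level of escaping, so a value that was correctly escaped in the source file survives only one read/write cycle.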

> CSVReader and CSVRecordSetWriter inject transformed backslash sequences from input
> ----------------------------------------------------------------------------------
>
>                 Key: NIFI-5874
>                 URL: https://issues.apache.org/jira/browse/NIFI-5874
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.8.0
>            Reporter: Ed Berezitsky
>            Priority: Major
>         Attachments: csv_bug.xml
>
>
> If there is a backslash sequence (such as \t or \n) in the input, CSVRecordSetWriter transforms it into the actual character (tab, newline, etc.) in the output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> UpdateRecord with `/a1: /a` (simply copies the value of one field to another)
> JsonRecordSetWriter will produce:
> {code:java}
> [{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
> and CSVRecordSetWriter will produce:
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> There is an actual tab character between the "=" signs.
> In the JSON object above, \t means an escaped tab. The actual issue comes from both the CSV Reader and the Writer.
> The Reader converts a backslash sequence into the actual character, but the Writer doesn't escape it back when it writes results, whereas the JSON Writer does.
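
The asymmetry described above can be illustrated with a writer that escapes special characters on output, as the JSON writer does. This is only a sketch in plain Java under stated assumptions: `escape` is a hypothetical helper, not part of NiFi or any CSV library, and the choice of which characters to escape is illustrative.

```java
public class EscapingWriterSketch {

    // Hypothetical helper: turns tab, newline and backslash back into
    // backslash sequences before the value is written out.
    public static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\\': out.append("\\\\"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In-memory value after the reader has already produced a real tab.
        String inMemory = "=\t=";

        // Escaped form: the tab is written as backslash followed by 't'.
        System.out.println(escape(inMemory));
    }
}
```

With escaping applied on write, the output would carry the same backslash sequence the input had, so a later read of that output would reproduce the same in-memory value instead of drifting.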


