Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2017/05/26 22:43:05 UTC

[jira] [Updated] (PIG-4623) Fixed the 'new line' character inside double-quote causing the csv parsing failure

     [ https://issues.apache.org/jira/browse/PIG-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4623:
------------------------------------
    Fix Version/s:     (was: 0.17.0)
                   0.18.0

> Fixed the 'new line' character inside double-quote causing the csv parsing failure
> ----------------------------------------------------------------------------------
>
>                 Key: PIG-4623
>                 URL: https://issues.apache.org/jira/browse/PIG-4623
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.15.0
>            Reporter: Ken Wu
>            Assignee: Ken Wu
>             Fix For: 0.18.0
>
>         Attachments: CSVLoader.java, PIG-4623-1.patch, TestCSVStorage.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A newline character inside a double-quoted field is valid CSV and should be accepted. For example, the following two lines should be treated as a SINGLE valid CSV record:
> Iphone,"{ ItemName : Cheez-It
> 21 Ounce}",
> However, the current implementation of getNext() in the org.apache.pig.piggybank.storage.CSVLoader class does not handle this case: it sees two lines of data where it should see a single record.
> This pull request fixes the above issue.
> (Note: here is a link for validating a CSV document: http://csvlint.io/)
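
The behaviour described above can be illustrated with a minimal sketch. This is not the attached PIG-4623 patch and not the piggybank CSVLoader code; the class name QuotedRecordReader and its methods are hypothetical. It assumes double quotes appear only as field delimiters or as doubled escapes (""), so an odd running count of quotes means a quoted field is still open and the next physical line belongs to the same record.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Pig or piggybank.
class QuotedRecordReader {

    // True when the line contains an odd number of double-quote characters.
    // An escaped quote ("") contributes two, so it does not change the parity.
    static boolean hasOddQuoteCount(String line) {
        boolean odd = false;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                odd = !odd;
            }
        }
        return odd;
    }

    // Joins physical lines into logical CSV records: a record ends only at a
    // line break that falls outside a double-quoted field.
    static List<String> readRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQuotes = false;
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (insideQuotes) {
                    // The previous line break was inside a quoted field: keep it.
                    current.append('\n');
                }
                current.append(line);
                // An odd number of quotes on this line toggles the open/closed state.
                insideQuotes ^= hasOddQuoteCount(line);
                if (!insideQuotes) {
                    records.add(current.toString());
                    current.setLength(0);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (insideQuotes) {
            // Unterminated quote at end of input: emit what was read, unchanged.
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        String csv = "Iphone,\"{ ItemName : Cheez-It\n21 Ounce}\",\nAndroid,plain,\n";
        for (String record : readRecords(csv)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

The sketch only groups lines into records; splitting a record into fields and unescaping doubled quotes would be a separate step.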



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)