Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/10/28 21:14:00 UTC

[jira] [Work logged] (CSV-253) Handle absent values in input (null)

     [ https://issues.apache.org/jira/browse/CSV-253?focusedWorklogId=335197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-335197 ]

ASF GitHub Bot logged work on CSV-253:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Oct/19 21:13
            Start Date: 28/Oct/19 21:13
    Worklog Time Spent: 10m 
      Work Description: lbruun commented on pull request #51: [CSV-253] Handle absent values in input
URL: https://github.com/apache/commons-csv/pull/51
 
 
   This PR makes it possible to translate an absent value in CSV input into a Java `null` value. Previously there was no way to do this: such a value would at best become a zero-length string when parsing. This made it impossible to correctly parse CSV output from, say, databases.
   
   In reference to [CSV-253](https://issues.apache.org/jira/browse/CSV-253).
   
   The PR addresses the issue by adding a flag on `Token` so that it becomes possible to distinguish a token that results from an absent value in the input from one that holds an actual zero-length string. A new modifier, `absentIsNull`, is introduced on `CSVFormat`. All existing formats and functionality are kept as-is, meaning the new feature is strictly opt-in.
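   
   To illustrate the idea, here is a minimal, self-contained sketch of a flag-carrying token. It is not the Commons CSV lexer and not the code in this PR: the class `AbsentValueSketch`, the `Field` type and the methods `tokenize` and `parseLine` are hypothetical names made up for this example, and the `absentIsNull` parameter only mirrors the name of the proposed modifier. It assumes a comma delimiter, double-quote quoting with doubled quotes as the escape, and one record per line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Commons CSV lexer and not the PR's code. */
class AbsentValueSketch {

    /** A field's text plus a flag recording whether it was absent in the input. */
    static final class Field {
        final String content;
        final boolean absent; // true when the field had zero characters and no quotes

        Field(String content, boolean absent) {
            this.content = content;
            this.absent = absent;
        }
    }

    /** Splits one comma-separated record; a doubled quote inside quotes is a literal quote. */
    static List<Field> tokenize(String line) {
        List<Field> fields = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean inQuotes = false;
        boolean sawQuote = false; // did the current field contain an opening quote?
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    sb.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    sb.append('"'); // escaped quote
                    i++;
                } else {
                    inQuotes = false; // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;
                sawQuote = true;
            } else if (c == ',') {
                // An empty, unquoted field is absent; "" is a real zero-length string.
                fields.add(new Field(sb.toString(), sb.length() == 0 && !sawQuote));
                sb.setLength(0);
                sawQuote = false;
            } else {
                sb.append(c);
            }
        }
        // The last field ends at the end of the record rather than at a delimiter.
        fields.add(new Field(sb.toString(), sb.length() == 0 && !sawQuote));
        return fields;
    }

    /** Maps absent fields to null only when the caller opts in. */
    static List<String> parseLine(String line, boolean absentIsNull) {
        List<String> values = new ArrayList<>();
        for (Field f : tokenize(line)) {
            values.add(absentIsNull && f.absent ? null : f.content);
        }
        return values;
    }

    public static void main(String[] args) {
        String[] rows = { "\"John\",,\"Doe\"", ",\"AA\",123", "\"John\",90,", "\"\",,90" };
        for (String row : rows) {
            System.out.println(parseLine(row, true));
        }
    }
}
```

   With the flag off, every absent field comes back as a zero-length string, which corresponds to the existing behaviour described above. With the flag on, only unquoted empty fields become `null`, while `""` stays a zero-length string.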
   
   As a possible next step, the pre-defined CSV formats for databases (i.e. `INFORMIX_UNLOAD_CSV`, `MYSQL`, `ORACLE` and `POSTGRESQL_CSV`) should be reviewed. I suspect that at least `POSTGRESQL_CSV` has always been incorrect in this matter. With this PR it can be corrected.
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 335197)
    Remaining Estimate: 0h
            Time Spent: 10m

> Handle absent values in input (null)
> ------------------------------------
>
>                 Key: CSV-253
>                 URL: https://issues.apache.org/jira/browse/CSV-253
>             Project: Commons CSV
>          Issue Type: Improvement
>          Components: Parser
>            Reporter: Lars Bruun-Hansen
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The parser must be able to handle absent values in input and translate that into {{null}} as required. I see several tickets on this matter in the history, but none seem to have addressed the issue, at least not for parsing. 
> For this problem, I see a need to introduce a new term:
> Definition: An _absent value_ is when there are zero characters between two field delimiters, or between a field delimiter and the start or end of the record.
> Specifically the aim is to be able to parse the following:
> {noformat}
>     "John",,"Doe"    // 2nd element is absent
>     ,"AA",123        // 1st element is absent
>     "John",90,       // 3rd element is absent
>     "",,90           // 2nd element is absent (1st element isn't)
> {noformat}
>  
> See also CSV-93 which I think never addressed the issue, probably because the reporter was happy with having the issue fixed for CSV output, not for parsing.
> A PR is coming...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)