Posted to issues@commons.apache.org by "Patrick Gäckle (JIRA)" <ji...@apache.org> on 2018/04/03 08:26:00 UTC

[jira] [Comment Edited] (CSV-222) invalid char between encapsulated token and delimiter

    [ https://issues.apache.org/jira/browse/CSV-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419448#comment-16419448 ] 

Patrick Gäckle edited comment on CSV-222 at 4/3/18 8:25 AM:
------------------------------------------------------------

This is the option I'd like to use, but how can I set it to these non-printable characters?
It would also be nice to include the position in the log statement as another hint on where to search.

I'd really like to see an option to simply skip characters that are not identified as part of a column.
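
Until such an option exists, a workaround along these lines might do. This is only a minimal sketch: `ControlCharStrippingReader` is a hypothetical helper, not part of Commons CSV. It assumes the offending characters are ASCII control characters (such as SOH and STX) that can safely be dropped before the parser sees them.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/**
 * Hypothetical helper, not part of Commons CSV: a Reader wrapper that drops
 * ASCII control characters below 0x20, except TAB, LF and CR.
 */
class ControlCharStrippingReader extends FilterReader {

    ControlCharStrippingReader(Reader in) {
        super(in);
    }

    /** True for characters this sketch chooses to discard (assumption: they carry no data). */
    private static boolean isDropped(int c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && isDropped(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the kept characters to the front of the filled region.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char ch = buf[off + i];
                if (!isDropped(ch)) {
                    buf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // Everything in this chunk was dropped; read the next chunk.
        }
    }

    /** Convenience for trying the filter on a String. */
    static String strip(String s) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new ControlCharStrippingReader(new StringReader(s))) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```

The wrapped reader could then be handed to the parser in place of the original, e.g. `csvFormat.parse(new ControlCharStrippingReader(reader))`. This silently discards data, so it only fits cases where those characters carry no meaning.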


was (Author: lostkatana):
This is the current workaround I use.
It would also be nice to include the position in the log statement as another hint on where to search.

I'd really like to see an option to simply skip characters that are not identified as part of a column.

> invalid char between encapsulated token and delimiter
> -----------------------------------------------------
>
>                 Key: CSV-222
>                 URL: https://issues.apache.org/jira/browse/CSV-222
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.4
>            Reporter: Patrick Gäckle
>            Priority: Major
>         Attachments: faulty.csv
>
>
> When trying to read the file [^faulty.csv] and parse it I get the following error:
> {code}
> java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
> 	at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
> 	at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
> 	at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
> 	at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
> 	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
> 	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
> 	at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
> {code}
> The failing code is the part that parses the input and returns its iterator:
> {code:java}
> csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
> iterator = csvFormat.parse(reader).iterator();
> {code}
> The invalid characters are the non-printable SOH and STX characters at the end of the line.
> I debugged through the source and ran into the exception in the Lexer, which does not handle these special characters.
> Unfortunately I'm not able to provide any hints on fixing this, as I'm not familiar with this type of character or what behaviour it should have.
> Sincerely



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)