Posted to issues@commons.apache.org by "Patrick Gäckle (JIRA)" <ji...@apache.org> on 2018/03/21 11:06:00 UTC

[jira] [Updated] (CSV-222) invalid char between encapsulated token and delimiter

     [ https://issues.apache.org/jira/browse/CSV-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Gäckle updated CSV-222:
-------------------------------
    Description: 
When I try to read and parse the file [^faulty.csv], I get the following error:

{code}
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
	at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
	at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
	at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
	at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
	at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
{code}

The failing code is the part that parses the input and returns an iterator over the records:

{code:java}
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
{code}

The invalid characters are the non-printable SOH and STX control characters at the end of the line.
I debugged through the source and hit the exception in {noformat}Lexer#parseEncapsulatedToken{noformat}.
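
For context, SOH and STX are the ASCII control characters U+0001 and U+0002. The following is a minimal sketch, using only the JDK, of how such characters could be located in a line or stripped from it before the line reaches the parser. The class name, the method names, and the sample line are hypothetical; the actual contents of faulty.csv are not reproduced here, and stripping the characters is only one possible workaround, not a fix confirmed by the Commons CSV project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Commons CSV. */
final class ControlCharScan {

    /** True for ISO control characters other than tab, CR and LF. */
    static boolean isUnexpectedControl(char c) {
        return Character.isISOControl(c) && c != '\t' && c != '\r' && c != '\n';
    }

    /** Lists the index and code point of every unexpected control character. */
    static List<String> findControlChars(String line) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (isUnexpectedControl(c)) {
                found.add("index " + i + ": U+" + String.format("%04X", (int) c));
            }
        }
        return found;
    }

    /** Returns a copy of the line with unexpected control characters removed. */
    static String stripControlChars(String line) {
        StringBuilder sb = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (!isUnexpectedControl(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: a quoted header line followed by SOH (0x01) and STX (0x02).
        String line = "\"id\";\"name\"" + (char) 0x01 + (char) 0x02;
        System.out.println(findControlChars(line));
        System.out.println(stripControlChars(line));
    }
}
```

Whether dropping these characters is acceptable depends on where the file comes from; if they carry meaning for the producing system, reporting them may be preferable to silently removing them.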

Unfortunately, I can't offer any hints on fixing this, as I'm not familiar with these types of characters or how they should be handled.

Sincerely

  was:
When trying to read the file [^faulty.csv] and parse it I get the folowwing error:

{code}
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
	at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
	at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
	at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
	at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
	at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
{code}

The line of code is the parsing part returning the iterator of it:

{code:java}
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
{code}

The invalid char is the contained SOH and STX non printable characters at the end of line.
I debugged through the source of this and ran into the Exception in {noformat}Lexer#parseEncapsulatedToken{noformat}.

Unfortunately I'm not able to provide some hints on fixing this as I'm not familiar with these type of characters and what behaviour they should have.

Sincerely


> invalid char between encapsulated token and delimiter
> -----------------------------------------------------
>
>                 Key: CSV-222
>                 URL: https://issues.apache.org/jira/browse/CSV-222
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.4
>            Reporter: Patrick Gäckle
>            Priority: Major
>         Attachments: faulty.csv



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)