Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/25 21:55:00 UTC
[jira] [Work logged] (CSV-290) Produced CSV using PostgreSQL format cannot be read
[ https://issues.apache.org/jira/browse/CSV-290?focusedWorklogId=811953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-811953 ]
ASF GitHub Bot logged work on CSV-290:
--------------------------------------
Author: ASF GitHub Bot
Created on: 25/Sep/22 21:54
Start Date: 25/Sep/22 21:54
Worklog Time Spent: 10m
Work Description: angusdev opened a new pull request, #265:
URL: https://github.com/apache/commons-csv/pull/265
I tested with psql 14.5 (Homebrew) on an M1 Mac.
CSVFormat.POSTGRESQL_CSV - special characters are not escaped.
CSVFormat.POSTGRESQL_TEXT - values are not quoted.
```sql
drop table if exists COMMONS_CSV_PSQL_TEST;
create table COMMONS_CSV_PSQL_TEST (ID INTEGER, COL1 VARCHAR, COL2 VARCHAR, COL3 VARCHAR, COL4 VARCHAR);
insert into COMMONS_CSV_PSQL_TEST select 1, 'abc', 'test line 1' || chr(10) || 'test line 2', null, '';
insert into COMMONS_CSV_PSQL_TEST select 2, 'xyz', '\b:' || chr(8) || ' \n:' || chr(10) || ' \r:' || chr(13), 'a', 'b';
insert into COMMONS_CSV_PSQL_TEST values (3, 'a', 'b,c,d', '"quoted"', 'e');
copy COMMONS_CSV_PSQL_TEST TO '/tmp/psql.csv' WITH (FORMAT CSV);
copy COMMONS_CSV_PSQL_TEST TO '/tmp/psql.tsv';
```
```
cat /tmp/psql.csv
1,abc,"test line 1
test line 2",,""
2,xyz,"\b:^H \n:
\r:^M",a,b
3,a,"b,c,d","""quoted""",e
```
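Reading the CSV output above, the quoting rule appears to be: NULL is written as an empty unquoted field, an empty string as `""`, and a value is quoted only when it contains the delimiter, a quote, or a line break, with embedded quotes doubled and backslashes left untouched. Below is a minimal Java sketch of that rule, inferred from this output only. The class and method names are hypothetical, and this is not the Commons CSV implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that mimics the COPY ... WITH (FORMAT CSV) output shown above. */
public class PsqlCsvSketch {

    /** Encodes one field; a Java null stands for SQL NULL. */
    public static String toCsvField(String value) {
        if (value == null) {
            return "";        // NULL: empty and unquoted
        }
        if (value.isEmpty()) {
            return "\"\"";    // empty string: quoted, so it differs from NULL
        }
        boolean needsQuotes = value.indexOf(',') >= 0 || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0 || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;     // backslashes alone do not trigger quoting
        }
        // Embedded quotes are doubled; backslashes and control characters pass through as-is.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins the encoded fields with commas (assumed delimiter). */
    public static String toCsvRow(List<String> fields) {
        return fields.stream().map(PsqlCsvSketch::toCsvField).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Rows 1 and 3 of the test table above.
        System.out.println(toCsvRow(Arrays.asList("1", "abc", "test line 1\ntest line 2", null, "")));
        System.out.println(toCsvRow(Arrays.asList("3", "a", "b,c,d", "\"quoted\"", "e")));
    }
}
```

Other conditions under which PostgreSQL may quote a value are not covered here; the sketch only reflects what the three sample rows show.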
```
cat /tmp/psql.tsv
1 abc test line 1\ntest line 2 \N
2 xyz \\b:\b \\n:\n \\r:\r a b
3 a b,c,d "quoted" e
```
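The text-format output above follows a different rule: nothing is quoted, NULL becomes `\N`, and a backslash or control character inside a value is written as a backslash escape. Below is a minimal Java sketch of that rule, assuming the tab delimiter that COPY uses by default for text format. The class and method names are hypothetical, only the escapes visible in the output plus tab are handled, and this is not the Commons CSV implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that mimics the default COPY ... TO text output shown above. */
public class PsqlTextSketch {

    /** Encodes one field; a Java null stands for SQL NULL. */
    public static String toTextField(String value) {
        if (value == null) {
            return "\\N";     // NULL marker: backslash followed by N
        }
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash is doubled
                case '\b': sb.append("\\b"); break;  // backspace
                case '\n': sb.append("\\n"); break;  // newline
                case '\r': sb.append("\\r"); break;  // carriage return
                case '\t': sb.append("\\t"); break;  // tab, since it is the delimiter
                default:   sb.append(c);             // quotes and commas pass through
            }
        }
        return sb.toString();
    }

    /** Joins the encoded fields with tabs (assumed delimiter). */
    public static String toTextRow(List<String> fields) {
        return fields.stream().map(PsqlTextSketch::toTextField).collect(Collectors.joining("\t"));
    }

    public static void main(String[] args) {
        // Row 2 of the test table above: literal "\b:", "\n:", "\r:" next to the real control characters.
        String col2 = "\\b:\b \\n:\n \\r:\r";
        System.out.println(toTextRow(Arrays.asList("2", "xyz", col2, "a", "b")));
    }
}
```

Because nothing is quoted in this format, a value such as `"quoted"` or `b,c,d` is emitted unchanged, which matches row 3 of the output.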
Issue Time Tracking
-------------------
Worklog Id: (was: 811953)
Remaining Estimate: 0h
Time Spent: 10m
> Produced CSV using PostgreSQL format cannot be read
> ---------------------------------------------------
>
> Key: CSV-290
> URL: https://issues.apache.org/jira/browse/CSV-290
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.6, 1.9.0
> Reporter: Anatoliy Artemenko
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> CSV, produced using printer:
>
> CSVPrinter printer = new CSVPrinter(sw, CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>
> cannot be read by a parser using the same format:
>
> CSVParser parser = new CSVParser(new StringReader(sw.toString()), CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>
> To reproduce:
>
> {code:java}
> StringWriter sw = new StringWriter();
> CSVPrinter printer = new CSVPrinter(sw, CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> printer.printRecord("column1", "column2");
> printer.printRecord("v11", "v12");
> printer.printRecord("v21", "v22");
> printer.close();
> CSVParser parser = new CSVParser(new StringReader(sw.toString()), CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> System.out.println("headers: " + Arrays.equals(parser.getHeaderNames().toArray(), new String[] {"column1", "column2"}));
> Iterator<CSVRecord> i = parser.iterator();
> System.out.println("row: " + Arrays.equals(i.next().toList().toArray(), new String[] {"v11", "v12"}));
> System.out.println("row: " + Arrays.equals(i.next().toList().toArray(), new String[] {"v21", "v22"}));
> {code}
> I'd expect the above code to work, but it fails:
> {code:java}
> java.io.IOException: (startline 1) EOF reached before encapsulated token finished
> at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:371)
> at org.apache.commons.csv.Lexer.nextToken(Lexer.java:285)
> at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:701)
> at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:480)
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:432)
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:398)
> at Test.main(Test.java:25)
> {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)