Posted to issues@commons.apache.org by "Jisun, Shin (JIRA)" <ji...@apache.org> on 2018/05/28 07:38:00 UTC

[jira] [Created] (CSV-227) first column always quoting when multilingual language, when not on second column

Jisun, Shin created CSV-227:
-------------------------------

             Summary: first column always quoting when multilingual language, when not on second column
                 Key: CSV-227
                 URL: https://issues.apache.org/jira/browse/CSV-227
             Project: Commons CSV
          Issue Type: Bug
          Components: Parser
    Affects Versions: 1.5
            Reporter: Jisun, Shin


When a value contains multilingual (non-ASCII) characters (UTF-8 encoding), CSVPrinter always quotes it in the first column, but not in the other columns.

 
{code:java}
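// Example code (imports listed for completeness)
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.QuoteMode;

CSVFormat format = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL);
CSVPrinter printer = new CSVPrinter(System.out, format);

// Rows mixing Korean text, ASCII text, and empty values
List<String[]> temp = new ArrayList<String[]>();
temp.add(new String[] { "ㅁㅎㄷㄹ", "ㅁㅎㄷㄹ", "", "test2" });
temp.add(new String[] { "한글3", "hello3", "3한글3", "test3" });
temp.add(new String[] { "", "hello4", "", "test4" });

// Print each row as one CSV record
for (String[] temp1 : temp) {
    printer.printRecord(temp1);
}
printer.close();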
{code}
 

Result:

"ㅁㅎㄷㄹ",ㅁㅎㄷㄹ,,test2
"한글3",hello3,3한글3,test3
"",hello4,,test4

 

I found the code responsible.

Multilingual characters fall above 0x7E, so a value in the first column of a record that starts with such a character is always quoted.

  
{code:java}
// CSVFormat.class
...
char c = value.charAt(pos);

// RFC4180 (https://tools.ietf.org/html/rfc4180) TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
if (newRecord && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E)) {
    quote = true;
} else if (c <= COMMENT) {
...{code}
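
To make the effect of that condition easier to see, here is a minimal sketch that copies the boolean expression from the excerpt above into a standalone helper. The class name `FirstCharQuoteCheck` and the method `forcesQuote` are hypothetical names used only for illustration; they are not part of Commons CSV, and the sketch covers only the quoted condition, not the rest of the quoting logic.

```java
public class FirstCharQuoteCheck {

    // Hypothetical helper: same expression as the condition quoted from CSVFormat above.
    // 'c' is the first character of a value; 'newRecord' is true for the first column of a record.
    public static boolean forcesQuote(char c, boolean newRecord) {
        return newRecord
                && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E);
    }

    public static void main(String[] args) {
        // 'h' and '3' are ASCII; U+3141 and U+D55C are the first characters
        // of the Korean values used in the example records.
        char[] samples = { 'h', '3', '\u3141', '\uD55C' };
        for (char c : samples) {
            System.out.printf("U+%04X first-column=%b other-column=%b%n",
                    (int) c, forcesQuote(c, true), forcesQuote(c, false));
        }
    }
}
```

Under this condition, any first character above 0x7E yields true only when `newRecord` is true, which matches the reported output: the Korean value is quoted in the first column and left unquoted in later columns.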
 

Could you please fix this bug?

 


