Posted to issues@commons.apache.org by "Peter Koszek (JIRA)" <ji...@apache.org> on 2010/03/19 17:22:35 UTC
[jira] Issue Comment Edited: (SANDBOX-263) Excel strategy uses wrong separator
[ https://issues.apache.org/jira/browse/SANDBOX-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847419#action_12847419 ]
Peter Koszek edited comment on SANDBOX-263 at 3/19/10 4:22 PM:
---------------------------------------------------------------
RFC 4180 defines the comma as the field separator.
Excel itself uses the local (regional) configuration to identify the separator.
The following approach can help to predict the field separator:
On Windows, read the registry value "sList" under the key "HKCU\Control Panel\International".
On other systems, try to avoid a collision with the decimal separator, like this:
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// The following idea is based on a comment from
// http://www.experts-exchange.com/Programming/Languages/Q_24113673.html
DecimalFormatSymbols dfs = DecimalFormatSymbols.getInstance(Locale.getDefault());
char decimalSeparator = dfs.getDecimalSeparator();
char listSeparator = ',';
if (decimalSeparator == listSeparator) {
    // If the decimal separator is a comma, use a semicolon so that fewer fields need encapsulation (quoting)
    listSeparator = ';';
}
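The Windows branch of the approach above has no code yet, so here is a rough sketch that combines both branches. It assumes the `reg` command-line tool is on the PATH and prints the value on a line shaped like `sList    REG_SZ    ;`. The class and method names (`ListSeparatorGuess`, `fromLocale`, `parseRegQueryLine`, `guess`) are made up for illustration and are not part of commons-csv.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class ListSeparatorGuess {

    /** Non-Windows heuristic: avoid a collision with the decimal separator. */
    static char fromLocale(Locale locale) {
        char decimalSeparator = DecimalFormatSymbols.getInstance(locale).getDecimalSeparator();
        return decimalSeparator == ',' ? ';' : ',';
    }

    /**
     * Extracts the separator from one line of `reg query` output.
     * Assumed line shape: "    sList    REG_SZ    ;".
     * Returns null for any other line, or if the value is not exactly one
     * visible character (a whitespace separator is lost by trim()).
     */
    static Character parseRegQueryLine(String line) {
        String marker = "REG_SZ";
        int idx = line.indexOf(marker);
        if (idx < 0 || !line.trim().startsWith("sList")) {
            return null;
        }
        String value = line.substring(idx + marker.length()).trim();
        return value.length() == 1 ? Character.valueOf(value.charAt(0)) : null;
    }

    /** Tries the registry on Windows, otherwise falls back to the locale heuristic. */
    static char guess() {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            try {
                Process p = new ProcessBuilder("reg", "query",
                        "HKCU\\Control Panel\\International", "/v", "sList")
                        .redirectErrorStream(true)
                        .start();
                try (BufferedReader r = new BufferedReader(
                        new InputStreamReader(p.getInputStream()))) {
                    String line;
                    while ((line = r.readLine()) != null) {
                        Character c = parseRegQueryLine(line);
                        if (c != null) {
                            return c;
                        }
                    }
                }
            } catch (IOException e) {
                // reg.exe missing or not readable: fall through to the heuristic
            }
        }
        return fromLocale(Locale.getDefault());
    }

    public static void main(String[] args) {
        System.out.println("Guessed list separator: " + guess());
    }
}
```

With this heuristic, `fromLocale(Locale.GERMANY)` yields `;` and `fromLocale(Locale.US)` yields `,`. It remains a guess: a user can set any list separator in the regional settings.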
CSV should be a standard; Excel is a specific application which uses the CSV standard in a special way.
I wouldn't expect a CSV framework to be able to simulate Excel exactly.
CSV-based formatting works with any separator character.
I expect a CSV framework to fully support the standard and to let me configure individual solutions.
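To illustrate what a configurable separator means in practice, here is a small standalone sketch. It is not the commons-csv API: `SeparatorDemo` and `parse` are hypothetical names, and the quoting rules are only the basic RFC 4180 ones (double-quoted fields, doubled quotes inside them).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of commons-csv.
final class SeparatorDemo {

    /** Splits CSV text into rows of fields, using the given separator character. */
    static List<List<String>> parse(String input, char separator) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);
                } else if (i + 1 < input.length() && input.charAt(i + 1) == '"') {
                    field.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = false;  // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == separator) {
                row.add(field.toString());
                field.setLength(0);
            } else if (c == '\n') {
                row.add(field.toString());
                field.setLength(0);
                rows.add(row);
                row = new ArrayList<>();
            } else if (c != '\r') {    // unquoted carriage returns are dropped
                field.append(c);
            }
        }
        // Flush the last record if the input does not end with a newline.
        if (field.length() > 0 || !row.isEmpty()) {
            row.add(field.toString());
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse("a;b\nc;d", ';')); // prints [[a, b], [c, d]]
        System.out.println(parse("a;b\nc;d", ',')); // prints [[a;b], [c;d]]
    }
}
```

The same input gives different rows depending only on the separator argument, which is the kind of control a caller needs when the producing application does not follow the RFC default.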
> Excel strategy uses wrong separator
> -----------------------------------
>
> Key: SANDBOX-263
> URL: https://issues.apache.org/jira/browse/SANDBOX-263
> Project: Commons Sandbox
> Issue Type: Bug
> Components: CSV
> Reporter: Gunnar Wagenknecht
>
> The Excel strategy is defined as follows.
> {code}
> public static CSVStrategy EXCEL_STRATEGY = new CSVStrategy(',', '"', COMMENTS_DISABLED, ESCAPE_DISABLED, false,
>         false, false, false);
> {code}
> However, when I do a "Save as" in Excel the separator used is actually {{';'}}. Thus, parsing the CSV file as suggested in the JavaDoc of {{CSVParser}} fails.
> {code}
> String[][] data =
>     (new CSVParser(new StringReader("a;b\nc;d"), CSVStrategy.EXCEL_STRATEGY)).getAllValues();
> {code}
> Simple test to reproduce:
> {code}
> import java.io.IOException;
> import java.io.StringReader;
> import org.apache.commons.csv.CSVParser;
> import org.apache.commons.csv.CSVStrategy;
> public class CSVExcelStrategyBug {
>     public static void main(final String[] args) {
>         try {
>             System.out.println("Using ;");
>             parse("a;b\nc;d");
>             System.out.println();
>             System.out.println("Using ,");
>             parse("a,b\nc,d");
>         } catch (final IOException e) {
>             e.printStackTrace();
>         }
>     }
>     private static void parse(final String input) throws IOException {
>         final String[][] data = (new CSVParser(new StringReader(input), CSVStrategy.EXCEL_STRATEGY)).getAllValues();
>         for (final String[] row : data) {
>             System.out.print("[");
>             for (final String cell : row) {
>                 System.out.print("(" + cell + ")");
>             }
>             System.out.println("]");
>         }
>     }
> }
> {code}
> Actual output:
> {noformat}
> Using ;
> [(a;b)]
> [(c;d)]
> Using ,
> [(a)(b)]
> [(c)(d)]
> {noformat}
> Expected output:
> {noformat}
> Using ;
> [(a)(b)]
> [(c)(d)]
> Using ,
> [(a,b)]
> [(c,d)]
> {noformat}