Posted to dev@commons.apache.org by "Daniel Gredler (JIRA)" <ji...@apache.org> on 2006/07/29 01:31:21 UTC

[jira] Created: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard
----------------------------------------------------------------

                 Key: SANDBOX-161
                 URL: http://issues.apache.org/jira/browse/SANDBOX-161
             Project: Commons Sandbox
          Issue Type: Bug
          Components: CSV
    Affects Versions: Nightly Builds
            Reporter: Daniel Gredler


All the descriptions of the CSV format that I've seen state that:

- Double quotes (") are escaped using two double quotes (""), rather than a backslash (\").
- Embedded line breaks are allowed and don't need to be escaped... just enclose the field in double quotes.
- Because backslashes are not used to escape double quotes or line breaks, the backslashes themselves do not need to be escaped.
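
The three rules above can be sketched in a few lines of Java. This is only an illustration of the convention described in the linked articles, not the actual CSVPrinter code; the class name Rfc4180QuoteSketch and the method quote are made up for this example, and the choice to quote only when a field contains a quote, comma, or line break is an assumption.

```java
final class Rfc4180QuoteSketch {

    // Hypothetical helper: quote one field using doubled double quotes.
    // The field is wrapped in quotes only when it contains a double quote,
    // a comma, or a line break (an assumption made for this sketch).
    static String quote(String field) {
        boolean needsQuotes = field.indexOf('"') >= 0
                || field.indexOf(',') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field; // backslashes pass through untouched
        }
        StringBuilder out = new StringBuilder(field.length() + 2);
        out.append('"');
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '"') {
                out.append('"'); // a double quote is escaped by doubling it
            }
            out.append(c); // line breaks and backslashes are copied as-is
        }
        out.append('"');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("say \"hi\"")); // prints "say ""hi"""
        System.out.println(quote("C:\\temp"));   // prints C:\temp
    }
}
```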

CSVPrinter#escapeAndQuote(String) breaks these rules. Why?

http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
http://en.wikipedia.org/wiki/Comma-separated_values


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


[jira] Commented: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-161?page=comments#action_12424768 ] 
            
Henri Yandell commented on SANDBOX-161:
---------------------------------------

I'll look into fixing this, but patches are welcome.

At a rough guess, this means adding an encapsulatorEscapeChar to the CSVStrategy and changing the code in both the Parser and the Printer to consult it.
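
A rough sketch of what the printer side of that idea might look like. Everything here is hypothetical: the nested Strategy class is a stand-in for CSVStrategy, and the field name encapsulatorEscape, the constants STANDARD and BACKSLASH, and the method encapsulate are invented for illustration. Always wrapping the field in encapsulators is a simplification.

```java
final class EscapeCharPrinterSketch {

    // Hypothetical stand-in for CSVStrategy; only the two fields needed here.
    static final class Strategy {
        final char encapsulator;
        final char encapsulatorEscape;

        Strategy(char encapsulator, char encapsulatorEscape) {
            this.encapsulator = encapsulator;
            this.encapsulatorEscape = encapsulatorEscape;
        }
    }

    // Doubled-quote convention: the escape character is the quote itself.
    static final Strategy STANDARD = new Strategy('"', '"');
    // Backslash convention: the escape character is a backslash.
    static final Strategy BACKSLASH = new Strategy('"', '\\');

    // Wrap a field in encapsulators, prefixing every embedded encapsulator
    // and every embedded escape character with the escape character.
    static String encapsulate(String field, Strategy s) {
        StringBuilder out = new StringBuilder(field.length() + 2);
        out.append(s.encapsulator);
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == s.encapsulator || c == s.encapsulatorEscape) {
                out.append(s.encapsulatorEscape);
            }
            out.append(c);
        }
        out.append(s.encapsulator);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encapsulate("a\"b", STANDARD));  // prints "a""b"
        System.out.println(encapsulate("a\"b", BACKSLASH)); // prints "a\"b"
    }
}
```

With the escape character equal to the encapsulator, backslashes need no special handling, because the escape check only ever matches the quote.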


[jira] Commented: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-161?page=comments#action_12424761 ] 
            
Henri Yandell commented on SANDBOX-161:
---------------------------------------

"" is the one I've always seen too. 

RFC 4180 (http://tools.ietf.org/html/rfc4180), linked from the Wikipedia article, specifies that as well.


[jira] Commented: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-161?page=comments#action_12447474 ] 
            
Yonik Seeley commented on SANDBOX-161:
--------------------------------------

While correct CSV escaping/parsing should definitely be supported, I think backslash escaping should be retained as an option since it's so common.
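
One way to keep both conventions available on the parsing side is to pass the escape character in as a parameter. The sketch below is an assumption about how that could look, not existing Commons CSV code; the class EscapeOptionParserSketch and the method decode are hypothetical, and the method handles a single already-isolated encapsulated field rather than a whole line.

```java
final class EscapeOptionParserSketch {

    // Decode one encapsulated field. Pass '"' as the escape character for
    // the doubled-quote convention, or '\\' for backslash escaping.
    // A lone escape character that does not precede an encapsulator or
    // another escape character is copied through unchanged (lenient).
    static String decode(String quoted, char encapsulator, char escape) {
        int end = quoted.length() - 1;
        if (quoted.length() < 2
                || quoted.charAt(0) != encapsulator
                || quoted.charAt(end) != encapsulator) {
            throw new IllegalArgumentException("field is not encapsulated: " + quoted);
        }
        StringBuilder out = new StringBuilder(quoted.length());
        for (int i = 1; i < end; i++) {
            char c = quoted.charAt(i);
            if (c == escape && i + 1 < end) {
                char next = quoted.charAt(i + 1);
                if (next == encapsulator || next == escape) {
                    out.append(next); // keep the escaped character only
                    i++;              // skip past the character just consumed
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("\"a\"\"b\"", '"', '"'));  // prints a"b
        System.out.println(decode("\"a\\\"b\"", '"', '\\')); // prints a"b
    }
}
```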



[jira] Updated: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

Posted by "Daniel Gredler (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/SANDBOX-161?page=all ]

Daniel Gredler updated SANDBOX-161:
-----------------------------------

    Attachment: commons-csv-patch-standard-escaping.txt

Attaching a patch to fix this... a couple of notes:

 - I would talk to whoever initially wrote this code before applying; the non-standard escape mechanism was very intentional, and I still don't understand why.
 - One of the unit tests checked all sorts of non-standard backslash-escaping corner cases; it no longer applied, so I removed it.
 - The patch makes a couple of trivial fixes to some typos and smelly one-liners... sorry, I couldn't help myself!
 - I tried to format my code so that it matches the surrounding code, but it looks like there are multiple styles used throughout. What's up with that?

Anyways, check it out and let me know what you think!


[jira] Commented: (SANDBOX-161) CSVPrinter#escapeAndQuote(String) doesn't adhere to CSV standard

Posted by "Daniel Gredler (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-161?page=comments#action_12424835 ] 
            
Daniel Gredler commented on SANDBOX-161:
----------------------------------------

I agree about the "encapsulatorEscapeChar" CSVStrategy property (maybe using the shorter name "escapeChar"). My first intuition when I ran across this problem was to look for exactly this kind of property.
