Posted to issues@commons.apache.org by "Michael Knapp (JIRA)" <ji...@apache.org> on 2012/11/25 00:28:59 UTC
[jira] [Comment Edited] (LANG-860) String split with an escape pattern
[ https://issues.apache.org/jira/browse/LANG-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503436#comment-13503436 ]
Michael Knapp edited comment on LANG-860 at 11/24/12 11:28 PM:
---------------------------------------------------------------
I beg to differ. Commons-csv assumes there can be an escape character; my code assumes there can be an escape pattern, so it handles a much broader range of problems than CSV. For example, what if you want to get all the parenthesized text out of a document? Commons-csv cannot do that because '(' and ')' are different characters. Commons-csv also offers no method to retain the delimiters that you split on, and my code does. Say you split on a pattern matching open and closed parentheses: no existing split function in commons-lang, and no function in commons-csv, can retain the text that matched your regular expression delimiter, but my code can. The code I wrote does not replace commons-csv, nor does it try to. Commons-csv handles comments, empty lines, trimming text, and a whole lot more that is out of the scope of my code. Also, if you expect anybody to use commons-csv, you should really put it on the central Maven repository and document it a little more.
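To make the delimiter-retention point concrete, here is a minimal sketch using only java.util.regex. It is not the code from the attached patch; the class name RetainingSplit and the method splitRetaining are hypothetical names chosen for illustration. It splits on a regular expression delimiter and keeps each matched delimiter as its own element of the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the method proposed in the patch.
class RetainingSplit {

    /**
     * Splits text on a regex delimiter and keeps every matched delimiter
     * in the output, interleaved with the text between matches.
     */
    static List<String> splitRetaining(String text, Pattern delimiter) {
        List<String> parts = new ArrayList<>();
        Matcher m = delimiter.matcher(text);
        int last = 0;
        while (m.find()) {
            if (m.start() == m.end()) {
                continue; // ignore empty matches; they delimit nothing
            }
            parts.add(text.substring(last, m.start())); // text before the delimiter
            parts.add(m.group());                       // the delimiter match itself
            last = m.end();
        }
        parts.add(text.substring(last)); // trailing text after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        // Treat each parenthesized run as the delimiter.
        Pattern parens = Pattern.compile("\\([^()]*\\)");
        System.out.println(splitRetaining("see (alpha) and (beta) here", parens));
    }
}
```

The elements at odd positions of the returned list are the parenthesized runs, so the same call also answers the "get all the parenthesized text out of a document" question. Nested parentheses are not handled by this pattern.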
was (Author: msknapp):
I beg to differ. Commons-csv assumes there can be an escape character; my code assumes there can be an escape pattern, so it handles a much broader range of problems than CSV. For example, what if you want to get all the parenthesized text out of a document? Commons-csv cannot do that because '(' and ')' are different characters. Commons-csv also offers no method to retain the delimiters that you split on, and my code does. Say you split on a pattern matching open and closed parentheses: no existing split function in commons-lang, and no function in commons-csv, can retain the text that matched your delimiter, but my code can. The code I wrote does not replace commons-csv, nor does it try to. Commons-csv handles comments, empty lines, trimming text, and a whole lot more that is out of the scope of my code. Also, if you expect anybody to use commons-csv, you should really put it on the central Maven repository and document it a little more.
> String split with an escape pattern
> -----------------------------------
>
> Key: LANG-860
> URL: https://issues.apache.org/jira/browse/LANG-860
> Project: Commons Lang
> Issue Type: Improvement
> Components: lang.*
> Reporter: Michael Knapp
> Priority: Minor
> Labels: patch, split
> Attachments: StringUtilsSplitEscapingly.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Often there are strings that are delimited, but certain patterns can escape the delimiter. For example, quotes are used in CSV to escape a comma delimiter. I have written a couple of methods for StringUtils that split strings while considering the possibility of an escape pattern. For example, when given "a,\"b,c\",c", they will produce {"a","\"b,c\"","c"}. In my code, the delimiter can be a string, and it can be escaped by any regular expression pattern. Unit tests are already written and passing.
> I plan to attach the patch for this once the ticket is created. I just need a committer to review the patch, approve, and commit it for me.
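The behavior described in the issue can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of StringUtilsSplitEscapingly.patch: the class EscapedSplit, the method splitEscaped, and the helper enclosing are hypothetical names, and the sketch only checks whether the start of a delimiter falls inside an escaped region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; the real patch may work differently.
class EscapedSplit {

    /**
     * Splits text on a literal delimiter, ignoring any delimiter that
     * starts inside a match of escapePattern.
     */
    static List<String> splitEscaped(String text, String delimiter, Pattern escapePattern) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        // Record the [start, end) range of every escaped region.
        List<int[]> escaped = new ArrayList<>();
        Matcher m = escapePattern.matcher(text);
        while (m.find()) {
            escaped.add(new int[] {m.start(), m.end()});
        }

        List<String> parts = new ArrayList<>();
        int tokenStart = 0; // where the current token begins
        int from = 0;       // where to resume searching for the delimiter
        int idx;
        while ((idx = text.indexOf(delimiter, from)) >= 0) {
            int[] region = enclosing(escaped, idx);
            if (region != null) {
                from = region[1]; // delimiter is escaped: skip past the region
                continue;
            }
            parts.add(text.substring(tokenStart, idx));
            tokenStart = idx + delimiter.length();
            from = tokenStart;
        }
        parts.add(text.substring(tokenStart));
        return parts;
    }

    /** Returns the escaped region containing index, or null if there is none. */
    private static int[] enclosing(List<int[]> regions, int index) {
        for (int[] r : regions) {
            if (index >= r[0] && index < r[1]) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Pattern quoted = Pattern.compile("\"[^\"]*\""); // a double-quoted run
        // prints [a, "b,c", c]
        System.out.println(splitEscaped("a,\"b,c\",c", ",", quoted));
    }
}
```

The quoted run is kept verbatim in the output, quotes included, which matches the {"a","\"b,c\"","c"} result given in the description.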