Posted to issues@commons.apache.org by "Emmanuel Bourg (JIRA)" <ji...@apache.org> on 2008/04/15 14:47:05 UTC
[jira] Created: (LANG-426) String splitting with escaped delimiter
String splitting with escaped delimiter
---------------------------------------
Key: LANG-426
URL: https://issues.apache.org/jira/browse/LANG-426
Project: Commons Lang
Issue Type: New Feature
Affects Versions: 2.4
Reporter: Emmanuel Bourg
Priority: Minor
Fix For: 3.0
In Commons Configuration we use a custom split method that supports the concept of an escaped delimiter. It would be nice if this were available in Commons Lang (as a method in StringUtils, or as a setting in StrTokenizer).
Example:
{code}
a,b\,c,d -> ["a", "b,c", "d"]
{code}
Here is the code of the method:
{code:java}
// LIST_ESC_CHAR is a constant of the enclosing class (the backslash in the example above)
public static List<String> split(String s, char delimiter)
{
    if (s == null)
    {
        return new ArrayList<String>();
    }

    List<String> list = new ArrayList<String>();
    StringBuilder token = new StringBuilder();
    int begin = 0;
    boolean inEscape = false;

    while (begin < s.length())
    {
        char c = s.charAt(begin);
        if (inEscape)
        {
            // last character was the escape marker
            // can current character be escaped?
            if (c != delimiter && c != LIST_ESC_CHAR)
            {
                // no, so keep the escape character as a literal
                token.append(LIST_ESC_CHAR);
            }
            token.append(c);
            inEscape = false;
        }
        else
        {
            if (c == delimiter)
            {
                // found a list delimiter -> add token and reset buffer
                list.add(token.toString().trim());
                token = new StringBuilder();
            }
            else if (c == LIST_ESC_CHAR)
            {
                // the next character may be escaped
                inEscape = true;
            }
            else
            {
                token.append(c);
            }
        }
        begin++;
    }

    // Trailing escape character? Keep it as a literal
    if (inEscape)
    {
        token.append(LIST_ESC_CHAR);
    }
    // Add last token
    list.add(token.toString().trim());

    return list;
}
{code}
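To make the edge cases easier to see, here is a minimal self-contained sketch of the same logic with a few sample inputs. It assumes LIST_ESC_CHAR is the backslash, as the example suggests; the class name EscapedSplitDemo is made up for illustration and is not part of Commons Configuration or Commons Lang.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapedSplitDemo {
    // Assumption: the escape marker is the backslash, as in a,b\,c,d
    static final char LIST_ESC_CHAR = '\\';

    public static List<String> split(String s, char delimiter) {
        List<String> list = new ArrayList<String>();
        if (s == null) {
            return list;
        }
        StringBuilder token = new StringBuilder();
        boolean inEscape = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inEscape) {
                // Only the delimiter and the escape marker can be escaped;
                // in front of any other character the marker stays literal.
                if (c != delimiter && c != LIST_ESC_CHAR) {
                    token.append(LIST_ESC_CHAR);
                }
                token.append(c);
                inEscape = false;
            } else if (c == delimiter) {
                // Unescaped delimiter: close the current token
                list.add(token.toString().trim());
                token.setLength(0);
            } else if (c == LIST_ESC_CHAR) {
                inEscape = true;
            } else {
                token.append(c);
            }
        }
        if (inEscape) {
            // Input ended right after an escape marker: keep it
            token.append(LIST_ESC_CHAR);
        }
        list.add(token.toString().trim());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b\\,c,d", ','));    // [a, b,c, d]
        System.out.println(split("a\\\\,b", ','));      // [a\, b]   escaped escape marker
        System.out.println(split("a\\xb,c", ','));      // [a\xb, c] marker kept before 'x'
        System.out.println(split("a,b\\", ','));        // [a, b\]   trailing marker kept
        System.out.println(split(" a , b ", ','));      // [a, b]    tokens are trimmed
        System.out.println(split("", ',').size());      // 1 (a single empty token)
        System.out.println(split(null, ',').size());    // 0
    }
}
```

Note that an empty string yields one empty token while null yields an empty list, and that every token is trimmed. Both behaviours would need to be decided explicitly if the method moved to Commons Lang.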
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (LANG-426) String splitting with escaped delimiter
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595882#action_12595882 ]
Henri Yandell commented on LANG-426:
------------------------------------
Need to write a unit test.
[jira] Updated: (LANG-426) String splitting with escaped delimiter
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henri Yandell updated LANG-426:
-------------------------------
Moving to 3.x. I don't think this would be backwards incompatible, so it can be done later.
> String splitting with escaped delimiter
> ---------------------------------------
>
> Key: LANG-426
> URL: https://issues.apache.org/jira/browse/LANG-426
> Project: Commons Lang
> Issue Type: New Feature
> Components: lang.text.*
> Affects Versions: 2.4
> Reporter: Emmanuel Bourg
> Priority: Minor
> Fix For: 3.x
[jira] Commented: (LANG-426) String splitting with escaped delimiter
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778079#action_12778079 ]
Henri Yandell commented on LANG-426:
------------------------------------
With regard to StringUtils: the biggest concern is that this balloons the API. There are currently 4 split methods, plus 10 other splitByXyz type methods. I think StrTokenizer is the better place to pursue this.
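As a rough illustration of the tokenizer route, the sketch below puts the escape character behind a setter instead of adding another split overload. Everything in it is hypothetical: EscapingTokenizer is not a Commons Lang class, and setEscapeChar is an invented name, not an existing StrTokenizer method. Only the fluent setter style and the getTokenList name are borrowed from StrTokenizer. Trimming is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical tokenizer with a configurable escape character (illustration only). */
public class EscapingTokenizer implements Iterable<String> {
    private final String input;
    private char delimiter = ',';
    private char escape = '\\';

    public EscapingTokenizer(String input) {
        this.input = input;
    }

    public EscapingTokenizer setDelimiterChar(char delimiter) {
        this.delimiter = delimiter;
        return this;
    }

    // Hypothetical setting; the issue only proposes something like it.
    public EscapingTokenizer setEscapeChar(char escape) {
        this.escape = escape;
        return this;
    }

    public List<String> getTokenList() {
        List<String> tokens = new ArrayList<String>();
        if (input == null) {
            return tokens;
        }
        StringBuilder token = new StringBuilder();
        boolean inEscape = false;
        for (char c : input.toCharArray()) {
            if (inEscape) {
                // Keep the escape character unless it precedes the
                // delimiter or another escape character.
                if (c != delimiter && c != escape) {
                    token.append(escape);
                }
                token.append(c);
                inEscape = false;
            } else if (c == delimiter) {
                tokens.add(token.toString());
                token.setLength(0);
            } else if (c == escape) {
                inEscape = true;
            } else {
                token.append(c);
            }
        }
        if (inEscape) {
            token.append(escape);
        }
        tokens.add(token.toString());
        return tokens;
    }

    public Iterator<String> iterator() {
        return getTokenList().iterator();
    }

    public static void main(String[] args) {
        List<String> tokens = new EscapingTokenizer("a;b^;c;d")
                .setDelimiterChar(';')
                .setEscapeChar('^')
                .getTokenList();
        System.out.println(tokens); // [a, b;c, d]
    }
}
```

With this shape the StringUtils API stays as it is, and the escape behaviour becomes one more option on a class that already exists to hold tokenizing options.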
[jira] Updated: (LANG-426) String splitting with escaped delimiter
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henri Yandell updated LANG-426:
-------------------------------
Fix Version/s: 3.x (was: 3.0)