Posted to dev@commons.apache.org by Emmanuel Bourg <eb...@micropole-univers.com> on 2004/02/12 18:27:18 UTC
[lang] split & join
Hi, I noticed that the split method in StringUtils is not the reverse
operation of join. Is this intended? The split method treats adjacent
separators as one separator, unlike the Perl and JDK 1.4 split functions.
That means it's not possible to join an array and then split the result
to get an equivalent array, which is quite annoying when manipulating CSV
records. For example:
String[] tab1 = new String[] { "a", "b", "", "d" };
String[] tab2 = StringUtils.split(StringUtils.join(tab1, ';'), ';');
Here tab2 = { "a", "b", "d" }; the 3rd element of tab1 (the empty string) is lost.
It would be nice to have a flag on the split methods indicating whether
adjacent separators must be merged, or a new set of methods (slice()?) with
the same signatures that preserve empty elements.
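The round-trip problem described above can be reproduced without the library. The following is only a sketch: splitMerging is a hypothetical stand-in that mimics the merging behaviour reported for StringUtils.split (it is not the library's implementation), splitKeepingEmpty relies on the JDK's String.split with a negative limit, and String.join comes from a much later JDK than the 1.4 mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SplitJoinDemo {

    // Hypothetical stand-in for the behaviour described in the message:
    // runs of adjacent separators count as one, so empty tokens vanish.
    static String[] splitMerging(String s, char sep) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == sep) {
                if (cur.length() > 0) {      // skip empty tokens
                    out.add(cur.toString());
                    cur.setLength(0);
                }
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out.toArray(new String[0]);
    }

    // JDK split with a negative limit keeps every empty token,
    // including trailing ones, so it inverts a plain join.
    static String[] splitKeepingEmpty(String s, char sep) {
        return s.split(Pattern.quote(String.valueOf(sep)), -1);
    }

    public static void main(String[] args) {
        String[] tab1 = { "a", "b", "", "d" };
        String joined = String.join(";", tab1);   // "a;b;;d"

        // Three elements: the empty one is lost.
        System.out.println(Arrays.toString(splitMerging(joined, ';')));
        // Four elements: equal to tab1.
        System.out.println(Arrays.toString(splitKeepingEmpty(joined, ';')));
    }
}
```

Only the second variant gives back an array equal to tab1, which is the property wanted for CSV-like records.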
Emmanuel Bourg
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [lang] split & join
Posted by "Todd V. Jonker" <to...@consciouscode.com>.
Stick with split, otherwise users will have to remember the difference
between split and divide or chop or hack or cut or slice or some other
arbitrary word. The slight semantic difference is far from obvious in
the name.
This is why I am constantly digging through the [lang] API docs to
remember the difference between (for example) clean() and trim() and
strip(). There are subtle semantic differences hidden, none of which are
indicated by the method names, which are all equally arbitrary and
generic. I find it continually frustrating. Let's not make that mistake
again.
.T.
On Thu, 12 Feb 2004 19:08:29 +0100, "Emmanuel Bourg"
<eb...@micropole-univers.com> said:
> Joe Germuska wrote:
>
> > Maybe the best would just be an alternate method signature to split(...) ?
>
> This solution is a bit ugly i think, but it's the only one if we can't
> find a better name. What about divide(), break() or cut() ?
>
> Emmanuel Bourg
--
Todd V. Jonker
Conscious Code Ltd
The Practice of Programming
www.consciouscode.com
Re: [lang] split & join
Posted by Emmanuel Bourg <eb...@micropole-univers.com>.
Joe Germuska wrote:
> Maybe the best would just be an alternate method signature to split(...) ?
This solution is a bit ugly, I think, but it's the only option if we can't
find a better name. What about divide(), break() or cut()?
Emmanuel Bourg
Re: [lang] split & join
Posted by Joe Germuska <Jo...@Germuska.com>.
At 6:27 PM +0100 2/12/04, Emmanuel Bourg wrote:
>Hi, i noticed that the split method in StringUtils is not the
>reverse operation of join, is this intended ? The split method
>treats adjacent separators as one separator unlike the Perl and JDK
>1.4 split functions. That means it's not possible to join an array
>and then split the result to get a similar array, that's quite
>annoying when manipulating CSV records. For example:
>
>String[] tab1 = new String[] { "a", "b", "", "d" };
>
>String[] tab2 = StringUtils.split(StringUtils.join(tab1, ';'), ';');
>
>here tab2 = { "a", "b", "d" }, the 3rd element of tab1 is lost.
>
>That may be nice to have a flag on the split methods indicating if
>the separators must be merged, or a new set of methods (slice()?)
>with the same signatures and handling empty elements.
I think this came up and was ruled to be something that we're "stuck
with" for backwards compatibility. Of course, there could be new
methods, regardless of the names.
I'd advise against "slice," though, since it has a specific meaning
in Perl (a sub-region of an array), and the Perl parallels are so
strong otherwise that using one term differently would confuse people
horribly.
Maybe the best would just be an alternate method signature to split(...) ?
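One way to picture the "alternate method signature" idea is an overload that takes a boolean flag. This is purely an illustrative sketch: the class name, the preserveEmpty parameter and the method bodies are assumptions made for this example, not the [lang] API. The two-argument form keeps the merging behaviour so existing callers are unaffected.

```java
import java.util.ArrayList;
import java.util.List;

class SplitOverloadSketch {

    // Existing-style signature: adjacent separators are merged,
    // matching the backwards-compatible behaviour discussed above.
    static String[] split(String str, char sep) {
        return split(str, sep, false);
    }

    // Hypothetical overload: when preserveEmpty is true, every separator
    // ends a token, so empty elements survive a join/split round trip.
    static String[] split(String str, char sep, boolean preserveEmpty) {
        if (str == null) {
            return null;
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= str.length(); i++) {
            // Treat end-of-string like a final separator.
            if (i == str.length() || str.charAt(i) == sep) {
                if (preserveEmpty || i > start) {
                    tokens.add(str.substring(start, i));
                }
                start = i + 1;
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

With this shape, split("a;b;;d", ';') yields three tokens while split("a;b;;d", ';', true) yields four. The trade-off raised elsewhere in the thread still applies: a bare boolean at the call site says little about what it does. As far as I know, later Commons Lang releases settled on separately named methods (splitPreserveAllTokens) rather than a flag.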
Joe
--
Joe Germuska
Joe@Germuska.com
http://blog.germuska.com
"Imagine if every Thursday your shoes exploded if you tied them
the usual way. This happens to us all the time with computers, and
nobody thinks of complaining."
-- Jef Raskin