Posted to dev@commons.apache.org by Emmanuel Bourg <eb...@micropole-univers.com> on 2004/02/12 18:27:18 UTC

[lang] split & join

Hi, I noticed that the split method in StringUtils is not the reverse 
operation of join. Is this intended? The split method treats adjacent 
separators as one separator, unlike the Perl and JDK 1.4 split functions. 
That means it's not possible to join an array and then split the result 
to get the same array back, which is quite annoying when manipulating CSV 
records. For example:

String[] tab1 = new String[] { "a", "b", "", "d" };

String[] tab2 = StringUtils.split(StringUtils.join(tab1, ';'), ';');

Here tab2 = { "a", "b", "d" }: the third element of tab1 is lost.
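
To make the contrast concrete without depending on [lang], here is a minimal, JDK-only sketch. The splitMerging() helper is a hand-written stand-in that only emulates the merging behaviour described above; it is not the actual StringUtils code, and the class name is made up for illustration. It also assumes a JDK recent enough to have String.join and generics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitJoinRoundTrip {

    // Stand-in for the behaviour described in this message: adjacent
    // separators are merged, so empty elements disappear.
    // This is an illustration, not the real StringUtils implementation.
    static String[] splitMerging(String s, char sep) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == sep) {
                // Only emit a token if something was collected.
                if (cur.length() > 0) {
                    out.add(cur.toString());
                    cur.setLength(0);
                }
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tab1 = { "a", "b", "", "d" };
        String joined = String.join(";", tab1); // "a;b;;d"

        // Merging split: the empty third element is lost.
        System.out.println(Arrays.toString(splitMerging(joined, ';'))); // [a, b, d]

        // JDK split with a negative limit keeps every empty element.
        System.out.println(Arrays.toString(joined.split(";", -1)));     // [a, b, , d]
    }
}
```

The negative limit matters in the JDK call: with the default limit, String.split keeps interior empty strings but drops trailing ones.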

It might be nice to have a flag on the split methods indicating whether 
adjacent separators should be merged, or a new set of methods (slice()?) 
with the same signatures that preserve empty elements.

Emmanuel Bourg


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [lang] split & join

Posted by "Todd V. Jonker" <to...@consciouscode.com>.
Stick with split, otherwise users will have to remember the difference
between split and divide or chop or hack or cut or slice or some other
arbitrary word.  The slight semantic difference is far from obvious in
the name.

This is why I am constantly digging through the [lang] API docs to
remember the difference between (for example) clean() and trim() and 
strip().  There are subtle semantic differences hidden, none of which are
indicated by the method names, which are all equally arbitrary and
generic.  I find it continually frustrating.  Let's not make that mistake
again.

.T.


On Thu, 12 Feb 2004 19:08:29 +0100, "Emmanuel Bourg"
<eb...@micropole-univers.com> said:
> Joe Germuska wrote:
> 
> > Maybe the best would just be an alternate method signature to split(...) ?
> 
> This solution is a bit ugly i think, but it's the only one if we can't 
> find a better name. What about divide(), break() or cut() ?
> 
> Emmanuel Bourg


-- 
Todd V. Jonker

Conscious Code Ltd
The Practice of Programming
www.consciouscode.com



Re: [lang] split & join

Posted by Emmanuel Bourg <eb...@micropole-univers.com>.
Joe Germuska wrote:

> Maybe the best would just be an alternate method signature to split(...) ?

This solution is a bit ugly, I think, but it's the only one if we can't 
find a better name. What about divide(), break() or cut()?

Emmanuel Bourg




Re: [lang] split & join

Posted by Joe Germuska <Jo...@Germuska.com>.
At 6:27 PM +0100 2/12/04, Emmanuel Bourg wrote:
>Hi, i noticed that the split method in StringUtils is not the 
>reverse operation of join, is this intended ? The split method 
>treats adjacent separators as one separator unlike the Perl and JDK 
>1.4 split functions. That means it's not possible to join an array 
>and then split the result to get a similar array, that's quite 
>annoying when manipulating CSV records. For example:
>
>String[] tab1 = new String[] { "a", "b", "", "d" };
>
>String[] tab2 = StringUtils.split(StringUtils.join(tab1, ';'), ';');
>
>here tab2 = { "a", "b", "d" }, the 3rd element of tab1 is lost.
>
>That may be nice to have a flag on the split methods indicating if 
>the separators must be merged, or a new set of methods (slice()?) 
>with the same signatures and handling empty elements.

I think this came up and was ruled to be something that we're "stuck 
with" for backwards compatibility.  Of course, there could be new 
methods, regardless of the names.

I'd advise against "slice," though, since it has a specific meaning 
in Perl (a sub-region of an array), and the Perl parallels are so 
strong otherwise that using one term differently would confuse people 
horribly.

Maybe the best option would just be an alternate method signature for split(...)?
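
As a rough sketch of what such an alternate signature could look like (the class name SplitSketch, the boolean parameter preserveEmpty, and the null handling are all assumptions for illustration, not anything that exists in [lang]):

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    // Existing-style signature: keeps the merging behaviour, so current
    // callers would see no change.
    public static String[] split(String str, char separator) {
        return split(str, separator, false);
    }

    // Hypothetical overload: when preserveEmpty is true, adjacent
    // separators produce empty elements instead of being merged.
    public static String[] split(String str, char separator, boolean preserveEmpty) {
        if (str == null) {
            return null; // assumption: null in, null out
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        // Run one step past the end so the final token is flushed too.
        for (int i = 0; i <= str.length(); i++) {
            if (i == str.length() || str.charAt(i) == separator) {
                if (preserveEmpty || i > start) {
                    tokens.add(str.substring(start, i));
                }
                start = i + 1;
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

With this sketch, split("a;b;;d", ';', true) gives { "a", "b", "", "d" }, while the two-argument form still gives { "a", "b", "d" }. One design choice to be aware of: an empty input string yields a single empty element when preserveEmpty is true and an empty array otherwise.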

Joe

-- 
Joe Germuska            
Joe@Germuska.com  
http://blog.germuska.com    
       "Imagine if every Thursday your shoes exploded if you tied them 
the usual way.  This happens to us all the time with computers, and 
nobody thinks of complaining."
             -- Jef Raskin
