Posted to dev@commons.apache.org by "Inger, Matthew" <in...@Synygy.com> on 2003/11/13 21:19:33 UTC

[Bug 22692] - StringUtils.split ignores empty items

FYI:  I have submitted the DelimitedTokenizer class.
Could one of the committers please review this defect,
and commit the new files I have uploaded?  Or, I'd be
open to being a committer myself, and just checking it
in using CVS.


-----Original Message-----
From: bugzilla@apache.org [mailto:bugzilla@apache.org]
Sent: Wednesday, November 12, 2003 10:00 AM
To: commons-dev@jakarta.apache.org
Subject: DO NOT REPLY [Bug 22692] - StringUtils.split ignores empty
items


DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=22692>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=22692

StringUtils.split ignores empty items





------- Additional Comments From mattinger@yahoo.com  2003-11-12 14:59
-------
The attachment uploaded at 14:56 supersedes the one uploaded at 13:20

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

Re: [Bug 22692] - StringUtils.split ignores empty items

Posted by Stephen Colebourne <sc...@btopenworld.com>.
Thank you for your submission. The implementation looks to have the basics
of what is needed for a StringTokenizer replacement. My suggestions:

1) The implementation is perhaps a little too CSV focussed at present. For
example, by default I would expect settings similar to StringTokenizer,
splitting on whitespace.

2) There is no ability to support multiple delimiters or multiple quote
tokens. Related to #1.

3) There seems to be no way to ignore null/empty strings (i.e. not return
them).

4) The coding style doesn't match the rest of [lang], most noticeably the
placement of curly brackets.

5) Implement java.util.Iterator to give extra flexibility (no need to
implement remove()). Keep nextToken() of course!

6) Maybe add nextTokenAsBoolean(), nextTokenAsInt() to handle the most
common conversions when reading a known format file like CSV.
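
To make suggestions 1, 2, 3, 5 and 6 concrete, here is a minimal sketch of
the shape such a class might take. It is not the DelimitedTokenizer attached
to the bug and not anything in [lang]: the class name SketchTokenizer, its
constructors and the ignoreEmpty flag are all hypothetical, and quote handling
is left out entirely.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical sketch only. Not the DelimitedTokenizer from bug 22692 and
 * not a [lang] class; names and defaults here are assumptions.
 */
final class SketchTokenizer implements Iterator<String> {

    private final String text;
    private final String delimiters;   // every char in this string is a delimiter
    private final boolean ignoreEmpty; // true: skip empty tokens, as StringTokenizer does
    private int pos = 0;               // start index of the next token
    private String pending;            // one-token lookahead, null if not yet read

    /** Default (suggestion 1): split on whitespace and drop empty tokens. */
    SketchTokenizer(String text) {
        this(text, " \t\n\r\f", true);
    }

    /** Suggestions 2 and 3: several delimiter chars, optional empty tokens. */
    SketchTokenizer(String text, String delimiters, boolean ignoreEmpty) {
        this.text = text;
        this.delimiters = delimiters;
        this.ignoreEmpty = ignoreEmpty;
    }

    /** Suggestion 5: java.util.Iterator; remove() keeps its unsupported default. */
    @Override
    public boolean hasNext() {
        if (pending == null) {
            pending = read();
        }
        return pending != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String token = pending;
        pending = null;
        return token;
    }

    /** Kept alongside next(), as on StringTokenizer. */
    public String nextToken() {
        return next();
    }

    /** Suggestion 6: typed accessors for known-format input. */
    public int nextTokenAsInt() {
        return Integer.parseInt(nextToken().trim());
    }

    public boolean nextTokenAsBoolean() {
        return Boolean.parseBoolean(nextToken().trim());
    }

    /** Returns the next token, or null once the text is exhausted. */
    private String read() {
        while (pos <= text.length()) {
            int end = pos;
            while (end < text.length() && delimiters.indexOf(text.charAt(end)) < 0) {
                end++;
            }
            String token = text.substring(pos, end);
            pos = end + 1; // step past the delimiter (or past the end of the text)
            if (!(ignoreEmpty && token.isEmpty())) {
                return token;
            }
        }
        return null;
    }
}
```

With these assumptions, new SketchTokenizer("a,,b", ",", false) yields "a",
"" and "b", which is the empty-item behaviour the bug asks for, while the
one-argument constructor behaves like StringTokenizer on whitespace.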

I definitely want to see a Tokenizer in [lang], and this looks like a good
start. (I suggest Tokenizer is a sufficiently good name). We also need to
ensure that it performs well!
Thanks
Stephen
