Posted to dev@commons.apache.org by Todd Jonker <tv...@pobox.com> on 2003/07/15 03:38:50 UTC

Re: [lang] Pre 2.0 - StringUtils.isEmpty(), isNotEmpty() and stringsa with somespaces

Hi everyone,

Hen dropped me a note asking for feedback on Stephen's response to my
message to commons-user.

I think the proposal is a definite improvement, but I feel that it still
falls short of the kind of "semantic completeness" that I'd like to see, and
that I generally need in my own code.

Reading my email again, I reconsidered things and changed my mind about the
definition of "empty".  In my experience coders use the term "empty string"
to mean "" and _not_ null or whitespace-only.

I suggest coming up with well-defined terminology, then ensuring that the
library's method names accurately reflect the semantics.  There seems to be
a real need for a term encompassing both "" and null; I suggest the word
"trivial".



MY PROPOSAL
-----------

First the terminology:

   *  "empty" means a String object of zero length.
   *  "trivial" means null or an empty string.
   *  "whitespace" is as defined by Character.isWhitespace().
   *  "blank" means the value includes no non-whitespace characters.


Given these definitions, I believe the set of interesting methods would look
like this.  Columns represent the five interesting input classes.  I only
indicate true results to make the chart more readable; blank cells imply a
false result.


              s:  null    ""    "  "   " x "   "xyz"
                  ----   ----   ----   -----   -----
s == null         TRUE
isEmpty(s)               TRUE
isWhitespace(s)                 TRUE
isTrivial(s)      TRUE   TRUE
! isTrivial(s)                  TRUE    TRUE    TRUE
isBlank(s)        TRUE   TRUE   TRUE
! isBlank(s)                            TRUE    TRUE
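
For concreteness, here is a minimal Java sketch of the four predicates as the
chart defines them.  The class name StringPredicates and the private helper
are hypothetical, chosen only for illustration; this is not the library's
implementation.  Following the chart, isWhitespace is false for both null
and "", which is an assumption read off the table rather than anything the
library promises.

```java
// Hypothetical helper class; method names follow the terminology proposed above.
final class StringPredicates {
    private StringPredicates() {}

    /** "empty": a String object of zero length (null is NOT empty). */
    static boolean isEmpty(String s) {
        return s != null && s.length() == 0;
    }

    /** "trivial": null or an empty string. */
    static boolean isTrivial(String s) {
        return s == null || s.length() == 0;
    }

    /** Whitespace-only with at least one character, per the chart. */
    static boolean isWhitespace(String s) {
        return s != null && s.length() > 0 && hasOnlyWhitespace(s);
    }

    /** "blank": the value includes no non-whitespace characters. */
    static boolean isBlank(String s) {
        return s == null || hasOnlyWhitespace(s);
    }

    // True when every character satisfies Character.isWhitespace();
    // vacuously true for the zero-length string.
    private static boolean hasOnlyWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With these definitions, isBlank(s) is true exactly when isTrivial(s) or
isWhitespace(s) is true, so the rows of the chart are consistent with one
another.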


I assert that we shouldn't attempt to provide a method for every possible
combination of truth-values in this chart, so instead I just try to hit what
I think are the most common uses (in my experience, of course).

I tend to dislike things like isNotBlank since they increase the number of
methods one needs to wade through but add no new semantic expressiveness.
Also, the methods above would lead to isNotTrivial, where isNonTrivial is
much more natural.


Relative to Stephen's proposal:
    Stephen's  isEmpty         is equivalent to my  isTrivial.
    Stephen's  isEmptyTrimmed  is equivalent to my  isBlank.


Note that I have defined (isEmpty(null) == false) which I think matches what
coders generally mean by "empty string".  Given this definition of isEmpty()
I think that isNotEmpty should be removed.  My rationale is that, if it
exists then it should be the case that, for all input Strings s:

    isNotEmpty(s) ==  ! isEmpty(s)

However, this implies that (isNotEmpty(null) == true) which is probably
counterintuitive.  IMO we can reduce the number of bugs our users write by
removing this method.
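
A small sketch of the problem, assuming isNotEmpty were kept and defined as
the strict negation of the proposed isEmpty.  The class name NegationDemo is
hypothetical and the methods are illustrative, not the library's code.

```java
// Hypothetical demo: what strict negation of the proposed isEmpty would mean.
final class NegationDemo {
    private NegationDemo() {}

    /** Proposed definition: only a non-null, zero-length String is empty. */
    static boolean isEmpty(String s) {
        return s != null && s.length() == 0;
    }

    /** Strict negation, so that isNotEmpty(s) == !isEmpty(s) for every s. */
    static boolean isNotEmpty(String s) {
        return !isEmpty(s);
    }

    public static void main(String[] args) {
        System.out.println(isNotEmpty(null));   // prints "true"
        System.out.println(isNotEmpty(""));     // prints "false"
        System.out.println(isNotEmpty("xyz"));  // prints "true"
    }
}
```

A caller who writes "if (isNotEmpty(s)) s.length()" would hit a
NullPointerException for null input, which is the kind of bug I would like
the API to make harder to write.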

Given the methods specified above, current code is ported as follows:

    The current isEmpty(s)     becomes   isBlank(s)
    The current isNotEmpty(s)  becomes   ! isTrivial(s)
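
As a sanity check on this mapping, the sketch below compares the old and new
methods on the five input classes from the chart.  The "legacy" methods are
my reconstruction of the behaviour described in this thread (isEmpty trims,
isNotEmpty does not); their null handling is an assumption, and the class
name PortingCheck is hypothetical.

```java
// Hypothetical check that the porting rules preserve behaviour on the chart's inputs.
final class PortingCheck {
    private PortingCheck() {}

    // Assumed legacy behaviour: null or empty-after-trim counts as empty.
    static boolean legacyIsEmpty(String s) {
        return s == null || s.trim().length() == 0;
    }

    // Assumed legacy behaviour: no trimming at all.
    static boolean legacyIsNotEmpty(String s) {
        return s != null && s.length() > 0;
    }

    /** Proposed "trivial": null or zero length. */
    static boolean isTrivial(String s) {
        return s == null || s.length() == 0;
    }

    /** Proposed "blank": null, or no non-whitespace characters. */
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** True when both porting rules give the same answer as the legacy methods. */
    static boolean portsCleanly(String s) {
        return legacyIsEmpty(s) == isBlank(s)
            && legacyIsNotEmpty(s) == !isTrivial(s);
    }

    public static void main(String[] args) {
        String[] samples = { null, "", "  ", " x ", "xyz" };
        for (String s : samples) {
            System.out.println(portsCleanly(s));  // prints "true" for each sample
        }
    }
}
```

One caveat: String.trim() strips every character up to U+0020, while
Character.isWhitespace() uses a different character set, so the two can
disagree on unusual input (for example a string holding only U+2003, which
isWhitespace accepts but trim leaves alone).  The equivalence above is
checked only for the five sample inputs.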



ALTERNATIVE PROPOSAL
--------------------

Changing the semantics of isEmpty and isNotEmpty is very problematic since
the transition will be difficult for the library's users.  Even if we
document the changes thoroughly, I strongly suspect many users will not port
their code correctly (if at all).  I think everyone will make the transition
more smoothly and correctly if we just remove (or at least deprecate) both.

As an alternative, we could use 'isZeroLength' instead of 'isEmpty',
resulting in the following set of methods:


              s:  null    ""    "  "   " x "   "xyz"
                  ----   ----   ----   -----   -----
s == null         TRUE
isZeroLength(s)          TRUE
isWhitespace(s)                 TRUE
isTrivial(s)      TRUE   TRUE
! isTrivial(s)                  TRUE    TRUE    TRUE
isBlank(s)        TRUE   TRUE   TRUE
! isBlank(s)                            TRUE    TRUE


Given the methods specified above, current code is ported as follows:

    The current isEmpty(s)     becomes   isBlank(s)
    The current isNotEmpty(s)  becomes   ! isTrivial(s)


Personally I like the first proposal since I think the terminology is more
concise and better overall, but porting issues cannot be ignored.  IMO this
alternative proposal would be a great improvement to the current API.


Thanks for giving this proposal your consideration.  Please forward any
responses to me (as well as commons-dev) since I can't always keep up with
the traffic to this list.


.T.


> ---------- Forwarded message ----------
> Date: Sat, 12 Jul 2003 10:19:57 +0100
> From: Stephen Colebourne <sc...@btopenworld.com>
> Reply-To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> Subject: [lang] Pre 2.0 - StringUtils.isEmpty(),
>    isNotEmpty() and stringsa with somespaces
> 
> This got raised on the users list.
> 
> Since this is a major version release, now would be the time to correct this
> problem. (ie. for 2.0)
> 
> I propose:
> isEmpty() - CHANGE to not trim
> isNotEmpty() - no change
> isEmptyTrimmed - NEW, does what isEmpty used to
> isNotEmptyTrimmed - NEW
> 
> This is an incompatable change for isEmpty(), but seems the best solution,
> as we appear to define 'empty' as "" elsewhere and refer to trimming
> explicitly.
> 
> Stephen
> 
> ----- Original Message -----
> From: "Todd Jonker" <tv...@pobox.com>
> To: "Jakarta Commons Users List" <co...@jakarta.apache.org>
> Sent: Thursday, June 26, 2003 2:10 AM
> Subject: Re: StringUtils.isEmpty(), isNotEmpty() and stringsa with
> somespaces
> 
> 
>> In general, StringUtils seems to munge up the distinctions between null,
>> zero-length, and whitespace-only strings.  Obviously there is no concrete
>> definition of what "empty" means.  This makes the whole suite of related
>> methods hard to learn, IMO.
>> 
>> Personally, I think "empty" should mean "zero-length or whitespace-only",
>> but should not include null.  You can't really say that your box is empty
>> if you have no box at all.
>> 
>> Another appropriate term would be "blank".  Perhaps a way to fix all this
>> would be to deprecate the poorly-defined "empty" methods and replace them
>> with well-defined "blank" methods.
>> 
>> .T.
>> 
>> 
>> On 6/25/03 8:13 AM, scolebourne@btopenworld.com wrote:
>> 
>>> I would tend to agree that it is inconsistent. I can't remember if
>>> isNotEmpty() was in release 1.0. If it was then changing it makes me
>>> uncomfortable.
>>> 
>>> Stephen
>>> 
>>>>  from:    Reinhard Nägele <re...@mgm-edv.de>
>>>>  date:    Wed, 25 Jun 2003 11:02:57
>>>>  to:      commons-user@jakarta.apache.org
>>>>  subject: Re: StringUtils.isEmpty(), isNotEmpty() and stringsa with
>>>>  some spaces
>>>> 
>>>> This behavior is specified in the JavaDocs. However, I also think this
>>>> is not really consistent. Any thoughts?
>>>> 
>>>> Reinhard
>>>> 
>>>> 
>>>> Dmitri Ilyin wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> if is use isEmpty() method of StringUtils with "   " (string with some
>>>>> spaces) and isNotEmpty() with the same string i get true from both
>>>>> methods, becouse isEmpty trims the parameter string and isNotEmpty
>>>>> doesn't.
>>>>> 
>>>>> I think it is not OK.
>>>>> 
>>>>> regards
>>>>> 
>>>>> Dmitri

