Posted to dev@commons.apache.org by Morgan Delagrange <md...@yahoo.com> on 2002/02/28 19:22:31 UTC

[COLLECTIONS] Comparator observations

Hi all,

I have a couple of observations on the new Comparators in Collections.

First, I'm not sure all of these Comparators are generic enough to include
in Collections.  ComparableComparator and ReverseComparator seem to be right
on.  NumericStringComparator, PackageNameComparator, and UrlComparator seem
too specific for Collections though.

It seems that if a Collection or Comparator would seem out of place in the
JDK, it's not a good candidate for Collections.  PackageNameComparator and
UrlComparator in particular seem out of place.  Most comparators are highly
specific, and comparators are also very easy to write.  Consequently, we
should be wary of which Comparators we include.  Bay, would you object to
removing NumericStringComparator, PackageNameComparator and UrlComparator?

Second, none of the Comparators are Serializable.  Shouldn't they be, so
that their corresponding Collections will be serializable?

- Morgan





_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [COLLECTIONS] Comparator observations

Posted by "Michael A. Smith" <mi...@iammichael.org>.
On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> > hmmm...  anyone else out there have an opinion?
> 
> I do!  :)  I might be won over by NumericStringComparator if it could
> actually parse all Strings as numbers and throw a ClassCastException for
> non-numbers.  Right now it will accept a hodge-podge of Strings and numbers
> combined.  The problem is, how do you know the difference between a dash and
> a minus sign, a decimal point and a period?  And then there are the issues
> of localization, since not everyone represents decimal points etc. the same
> way.  The current implementation assumes that all numbers are non-negative
> integers, which is pretty limiting.

I know you have an opinion.  I was hoping someone else would say 
something as well.  :)

I agree that accepting a mixture of numbers and characters is a bit
confusing and doesn't make much sense to me.  +1 for modifying it to
accept string forms of numbers (integral and floating point, positive and
negative) and sort based on that, with the comparator rejecting any 
strings that are not valid numbers.

I think it'd probably be even better if I could add "1", "-2", "15.8",
"1e1" and have them sort:

-2
1
1e1
15.8

Essentially take the behavior of the BigDecimal class's string 
constructor.  Unfortunately, BigDecimal can't be used for the 
implementation because negative numbers and trailing exponents weren't 
introduced until jdk 1.3, and I think we're still aiming for jdk 1.2 
compatibility.
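
As a rough illustration of the ordering proposed above, here is a minimal sketch that hands parsing to BigDecimal's String constructor and rejects anything it cannot parse. It assumes a modern JDK (generics, and a BigDecimal that accepts signs and exponents), so it sidesteps the JDK 1.2 constraint rather than solving it. The class name StrictNumericStringComparator and the sorted() helper are hypothetical, not the class currently in Collections, and locale-specific decimal separators are not handled.

```java
import java.io.Serializable;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: orders Strings by their numeric value and
// rejects Strings that are not valid numbers.
final class StrictNumericStringComparator implements Comparator<String>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(String a, String b) {
        // BigDecimal.compareTo ignores scale, so "1e1" and "10" compare as equal.
        return parse(a).compareTo(parse(b));
    }

    private static BigDecimal parse(String s) {
        try {
            return new BigDecimal(s.trim());
        } catch (NumberFormatException e) {
            // Mirrors the suggestion in this thread to reject non-numbers.
            throw new ClassCastException("Not a number: " + s);
        }
    }

    // Illustrative helper: returns a sorted copy of the given values.
    static List<String> sorted(String... values) {
        List<String> list = new ArrayList<>(Arrays.asList(values));
        Collections.sort(list, new StrictNumericStringComparator());
        return list;
    }
}
```

With this sketch, sorted("1", "-2", "15.8", "1e1") returns the values in the order -2, 1, 1e1, 15.8, and a value such as "abc" causes a ClassCastException.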

I haven't had any time recently, otherwise I'd take a stab at 
reimplementing to have that behavior.  I still have other things on my 
list as well (like the dirty flag map implementation).  

regards,
michael






Re: [COLLECTIONS] Comparator observations

Posted by Morgan Delagrange <md...@yahoo.com>.
----- Original Message -----
From: "Michael A. Smith" <mi...@iammichael.org>
To: "Jakarta Commons Developers List" <co...@jakarta.apache.org>;
"Morgan Delagrange" <mo...@apache.org>
Sent: Thursday, February 28, 2002 9:26 PM
Subject: Re: [COLLECTIONS] Comparator observations


> On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> > If UrlComparator and NumericStringComparator are mainly for presentation
> > logic, my inclination would be to leave them out.  IMO once you go into the
> > pretty printing business, you eventually end up with a ton of classes with a
> > limited audience.  I'd prefer to stick to Comparator utilities and
> > Comparators that enforce a defensible ordering.  I'm curious what other
> > folks' opinions are; right now I seem to be in the minority, although it's
> > only a 2-1 minority.  :)
>
> I'm with you on the UrlComparator.  However, I still feel that the
> NumericStringComparator is "generic enough".  While it may be
> "presentation" oriented, it doesn't really ring of the arbitrary
> (application specific) ordering of the components that the UrlComparator
> has, so I don't really have a good reason to exclude it.  Plus, I believe
> it is useful to be able to perform numeric comparisons on a string based
> number.  I guess its usefulness doesn't necessarily mean it has to be
> included with collections though...
>
> hmmm...  anyone else out there have an opinion?
>
> regards,
> michael

I do!  :)  I might be won over by NumericStringComparator if it could
actually parse all Strings as numbers and throw a ClassCastException for
non-numbers.  Right now it will accept a hodge-podge of Strings and numbers
combined.  The problem is, how do you know the difference between a dash and
a minus sign, a decimal point and a period?  And then there are the issues
of localization, since not everyone represents decimal points etc. the same
way.  The current implementation assumes that all numbers are non-negative
integers, which is pretty limiting.

- Morgan




Re: [COLLECTIONS] Comparator observations

Posted by "Michael A. Smith" <mi...@iammichael.org>.
On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> If UrlComparator and NumericStringComparator are mainly for presentation
> logic, my inclination would be to leave them out.  IMO once you go into the
> pretty printing business, you eventually end up with a ton of classes with a
> limited audience.  I'd prefer to stick to Comparator utilities and
> Comparators that enforce a defensible ordering.  I'm curious what other
> folks' opinions are; right now I seem to be in the minority, although it's
> only a 2-1 minority.  :)

I'm with you on the UrlComparator.  However, I still feel that the 
NumericStringComparator is "generic enough".  While it may be 
"presentation" oriented, it doesn't really ring of the arbitrary 
(application specific) ordering of the components that the UrlComparator 
has, so I don't really have a good reason to exclude it.  Plus, I believe 
it is useful to be able to perform numeric comparisons on a string based 
number.  I guess its usefulness doesn't necessarily mean it has to be 
included with collections though...

hmmm...  anyone else out there have an opinion? 

regards,
michael




Re: [COLLECTIONS] Comparator observations

Posted by Morgan Delagrange <md...@yahoo.com>.
If UrlComparator and NumericStringComparator are mainly for presentation
logic, my inclination would be to leave them out.  IMO once you go into the
pretty printing business, you eventually end up with a ton of classes with a
limited audience.  I'd prefer to stick to Comparator utilities and
Comparators that enforce a defensible ordering.  I'm curious what other
folks' opinions are; right now I seem to be in the minority, although it's
only a 2-1 minority.  :)

- Morgan

----- Original Message -----
From: "Henri Yandell" <ba...@generationjava.com>
To: "Jakarta Commons Developers List" <co...@jakarta.apache.org>;
"Morgan Delagrange" <mo...@apache.org>
Sent: Thursday, February 28, 2002 3:36 PM
Subject: Re: [COLLECTIONS] Comparator observations


>
>
> > > Agreed. Should UrlComparator live with Url?
> >
> > > The other two I guess are more
> > > Util stuff? Hard to package them. I solved it by having all my comparators
> > > in a compare package.
> >
> > How about I just re-add UrlComparator and NumericStringComparator to Util
> > for now, and we can go from there?
>
> They're both mainly presentation based. I disagree that they're
> application specific, but they are common presentation level components
> rather than generic anywhere components.
>
> UrlComparator doesn't try to fix any UrlStreamHandler issues, but orders
> the Urls in a more usable way.
>
> >
> > > NumericStringComparator however is something I've often seen people asking
> > > for. It's not a simple implementation (well for the level of people asking
> > > for), so I think it has a lot of uses. It's all opinion though, so I
> > > dunno.
> >
> > To me, this one does stand out from the other two as potentially
> > appropriate.  What would a class like this be used for?  It still _sounds_ a
> > little application-specific, but maybe not.
>
> It's presentation specific, though I guess it could be used without. My
> most recent usage was in a JTable. The JTable columns were normally
> numbers, but could be strings, so I used String as the type of the cell.
> Sorting on String is bad as it places 51 before 6. So
> NumericStringComparator came in handy. I originally wrote it due to people
> asking how to do such a thing on a forum and the suggested answers all
> looking horribly heavy.
>
> However, server-side components can still be presentation specific I
> figure. A previous usage was in a jsp page. Maybe the
> Commons...comparators need to be bound to a sorting taglib if we've not
> got one already :)
>
> kHen




Re: [COLLECTIONS] Comparator observations

Posted by Henri Yandell <ba...@generationjava.com>.

> > Agreed. Should UrlComparator live with Url?
>
> > The other two I guess are more
> > Util stuff? Hard to package them. I solved it by having all my comparators
> > in a compare package.
>
> How about I just re-add UrlComparator and NumericStringComparator to Util
> for now, and we can go from there?

They're both mainly presentation based. I disagree that they're
application specific, but they are common presentation level components
rather than generic anywhere components.

UrlComparator doesn't try to fix any UrlStreamHandler issues, but orders
the Urls in a more usable way.

>
> > NumericStringComparator however is something I've often seen people asking
> > for. It's not a simple implementation (well for the level of people asking
> > for), so I think it has a lot of uses. It's all opinion though, so I
> > dunno.
>
> To me, this one does stand out from the other two as potentially
> appropriate.  What would a class like this be used for?  It still _sounds_ a
> little application-specific, but maybe not.

It's presentation specific, though I guess it could be used without. My
most recent usage was in a JTable. The JTable columns were normally
numbers, but could be strings, so I used String as the type of the cell.
Sorting on String is bad as it places 51 before 6. So
NumericStringComparator came in handy. I originally wrote it due to people
asking how to do such a thing on a forum and the suggested answers all
looking horribly heavy.
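
To make the "51 before 6" problem concrete, here is a small sketch comparing the natural String ordering with an ordering on the parsed integer value. It assumes Java 8 or later and non-negative integer cell values only; CellSortDemo and its methods are hypothetical names for illustration and are not the NumericStringComparator discussed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo of lexical versus numeric ordering of String cells.
final class CellSortDemo {
    // Natural String order compares character by character.
    static List<String> lexical(List<String> cells) {
        List<String> copy = new ArrayList<>(cells);
        Collections.sort(copy);
        return copy;
    }

    // Orders by the parsed integer value; throws NumberFormatException
    // for cells that are not integers.
    static List<String> numeric(List<String> cells) {
        List<String> copy = new ArrayList<>(cells);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("6", "51", "100");
        System.out.println(lexical(cells)); // prints [100, 51, 6]
        System.out.println(numeric(cells)); // prints [6, 51, 100]
    }
}
```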

However, server-side components can still be presentation specific I
figure. A previous usage was in a jsp page. Maybe the
Commons...comparators need to be bound to a sorting taglib if we've not
got one already :)

kHen




Re: [COLLECTIONS] Comparator observations

Posted by Morgan Delagrange <md...@yahoo.com>.
----- Original Message -----
From: "Henri Yandell" <ba...@generationjava.com>
To: "Jakarta Commons Developers List" <co...@jakarta.apache.org>;
"Morgan Delagrange" <mo...@apache.org>
Sent: Thursday, February 28, 2002 12:48 PM
Subject: Re: [COLLECTIONS] Comparator observations


>
>
> On Thu, 28 Feb 2002, Morgan Delagrange wrote:
>
> > First, I'm not sure all of these Comparators are generic enough to include
> > in Collections.  ComparableComparator and ReverseComparator seem to be right
> > on.  NumericStringComparator, PackageNameComparator, and UrlComparator seem
> > too specific for Collections though.
>
> Agreed. Should UrlComparator live with Url?

I guess.  It seems like a class with limited usefulness, which I tried to
flesh out in my email to Michael from a few minutes ago.

> The other two I guess are more
> Util stuff? Hard to package them. I solved it by having all my comparators
> in a compare package.

How about I just re-add UrlComparator and NumericStringComparator to Util
for now, and we can go from there?

> >
> > It seems that if a Collection or Comparator would seem out of place in the
> > JDK, it's not a good candidate for Collections.  PackageNameComparator and
> > UrlComparator in particular seem out of place.  Most comparators are highly
> > specific, and comparators are also very easy to write.  Consequently, we
> > should be wary of which Comparators we include.  Bay, would you object to
> > removing NumericStringComparator, PackageNameComparator and UrlComparator?
>
> PackageNameComparator is pretty specific. I'm happy for this to die.

OK, I won't re-add that to Util unless you say so.

> UrlComparator, it seems a nice thing to have available as a standard
> component. If there were to be a comparator project, then I would
> definitely want that in there.
>
> NumericStringComparator however is something I've often seen people asking
> for. It's not a simple implementation (well for the level of people asking
> for), so I think it has a lot of uses. It's all opinion though, so I
> dunno.

To me, this one does stand out from the other two as potentially
appropriate.  What would a class like this be used for?  It still _sounds_ a
little application-specific, but maybe not.

> >
> > Second, none of the Comparators are Serializable.  Shouldn't they be, so
> > that their corresponding Collections will be serializable?
>
> I didn't realise this was a feature :)

The JavaDocs for Comparator recommend making Comparators serializable unless
there's a specific reason not to.  Good advice.

> How does the serializable
> Comparator affect the Collection being sorted? Or is this for TreeSet etc?
> If so, then making them Serializable makes tons of sense.

Right, you can't serialize a TreeSet (or any Sorted collection) that is
ordered by an unserializable Comparator.  Otherwise there's no way to order
Objects that are added after de-serialization.
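
The point above can be checked with a short sketch: a TreeSet writes its Comparator as part of its serialized form, so serialization fails when the Comparator is not Serializable. This assumes a modern JDK with generics; PlainReverse, SerializableReverse and TreeSetSerializationDemo are hypothetical names, not classes from Collections.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical comparator that does NOT implement Serializable.
final class PlainReverse implements Comparator<String> {
    @Override
    public int compare(String a, String b) { return b.compareTo(a); }
}

// Same ordering, but Serializable as the Comparator JavaDocs recommend.
final class SerializableReverse implements Comparator<String>, Serializable {
    private static final long serialVersionUID = 1L;
    @Override
    public int compare(String a, String b) { return b.compareTo(a); }
}

final class TreeSetSerializationDemo {
    // Returns true if a TreeSet ordered by the given comparator can be serialized.
    static boolean canSerialize(Comparator<String> order) {
        TreeSet<String> set = new TreeSet<>(order);
        set.add("b");
        set.add("a");
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(set);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when the set tries to write its non-serializable comparator.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here canSerialize(new PlainReverse()) returns false, while canSerialize(new SerializableReverse()) returns true.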

> Hen




Re: [COLLECTIONS] Comparator observations

Posted by Henri Yandell <ba...@generationjava.com>.

On Thu, 28 Feb 2002, Morgan Delagrange wrote:

> First, I'm not sure all of these Comparators are generic enough to include
> in Collections.  ComparableComparator and ReverseComparator seem to be right
> on.  NumericStringComparator, PackageNameComparator, and UrlComparator seem
> too specific for Collections though.

Agreed. Should UrlComparator live with Url? The other two I guess are more
Util stuff? Hard to package them. I solved it by having all my comparators
in a compare package.

>
> It seems that if a Collection or Comparator would seem out of place in the
> JDK, it's not a good candidate for Collections.  PackageNameComparator and
> UrlComparator in particular seem out of place.  Most comparators are highly
> specific, and comparators are also very easy to write.  Consequently, we
> should be wary of which Comparators we include.  Bay, would you object to
> removing NumericStringComparator, PackageNameComparator and UrlComparator?

PackageNameComparator is pretty specific. I'm happy for this to die.
UrlComparator, it seems a nice thing to have available as a standard
component. If there were to be a comparator project, then I would
definitely want that in there.

NumericStringComparator however is something I've often seen people asking
for. It's not a simple implementation (well for the level of people asking
for), so I think it has a lot of uses. It's all opinion though, so I
dunno.

>
> Second, none of the Comparators are Serializable.  Shouldn't they be, so
> that their corresponding Collections will be serializable?

I didn't realise this was a feature :) How does the serializable
Comparator affect the Collection being sorted? Or is this for TreeSet etc?
If so, then making them Serializable makes tons of sense.

Hen




Re: [COLLECTIONS] Comparator observations

Posted by "Michael A. Smith" <mi...@iammichael.org>.
On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> > I'm just curious, but why do you feel they are too specific?  If someone
> > wants to create a TreeSet (or FastTreeSet, or pretty much any ordered
> > set), which contains URLs, a URL comparator might be useful.
> 
> Why would you want an ordered list of URLs?  For display purposes?  Or to
> guarantee that you are not referring to the same file?  If that's the case,
> that's already addressed by URLStreamHandler.equals() which returns "true if
> the two urls are considered equal, ie. they refer to the same fragment in
> the same file."  If this Comparator is trying to overcome deficiencies in
> URLStreamHandler, that should be documented in the class.  Even if it were
> the case, I don't believe a Comparator is the right place to address it.
> 
> Usually, if it makes sense to ascribe order to an Object, that Object
> implements Comparable.  In this particular case, the UrlComparator sorts by
> host, then path, then protocol, then port.  But why that and not protocol,
> host, path, port?  Or why not protocol, host, port, path?  URLs don't really
> have an implicit order, so the UrlComparator is guaranteed to be somewhat
> arbitrary.

Point well taken.  

> My objection is mainly their arbitrary ordering.  They try to sort Objects
> that don't have undeniable order.  That's why I'd rather see us focus on
> Comparators that reverse other comparators (ReverseComparator), or
> comparators that can be used to implement SortedSets or SortedMaps
> (ComparableComparator).  Other possibilities might be classes that let you
> chain Comparators together, or Comparators that truly address oversights in
> the JDK.  Comparators like the Soundex Comparator also make sense, because
> Soundex has an implicit ordering (but Soundex is also very specific, and
> it's more appropriate where it is in Codec.)  I'd rather not see us try to
> ascribe an arbitrary ordering to Objects like URLs and package names.

Again, point taken.  Soundex should definitely go with the soundex stuff.  
It is specific to that codec.

> But surely there's quite a distinction in scope between these classes that
> deal with iteration in general and a class that only sorts URLs?

my iteration example was referring to simplicity of implementation, not 
applicability to collections.  :)


regards,
michael




Re: [COLLECTIONS] Comparator observations

Posted by Morgan Delagrange <md...@yahoo.com>.
----- Original Message -----
From: "Michael A. Smith" <mi...@iammichael.org>
To: "Jakarta Commons Developers List" <co...@jakarta.apache.org>;
"Morgan Delagrange" <mo...@apache.org>
Sent: Thursday, February 28, 2002 12:46 PM
Subject: Re: [COLLECTIONS] Comparator observations


> On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> > First, I'm not sure all of these Comparators are generic enough to include
> > in Collections.  ComparableComparator and ReverseComparator seem to be right
> > on.  NumericStringComparator, PackageNameComparator, and UrlComparator seem
> > too specific for Collections though.
>
> I'm just curious, but why do you feel they are too specific?  If someone
> wants to create a TreeSet (or FastTreeSet, or pretty much any ordered
> set), which contains URLs, a URL comparator might be useful.

Why would you want an ordered list of URLs?  For display purposes?  Or to
guarantee that you are not referring to the same file?  If that's the case,
that's already addressed by URLStreamHandler.equals() which returns "true if
the two urls are considered equal, ie. they refer to the same fragment in
the same file."  If this Comparator is trying to overcome deficiencies in
URLStreamHandler, that should be documented in the class.  Even if it were
the case, I don't believe a Comparator is the right place to address it.

Usually, if it makes sense to ascribe order to an Object, that Object
implements Comparable.  In this particular case, the UrlComparator sorts by
host, then path, then protocol, then port.  But why that and not protocol,
host, path, port?  Or why not protocol, host, port, path?  URLs don't really
have an implicit order, so the UrlComparator is guaranteed to be somewhat
arbitrary.
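
To illustrate why the field order is arbitrary, here is a minimal sketch with two equally plausible orderings that disagree on the same pair of addresses. It assumes Java 8 or later and uses java.net.URI rather than java.net.URL to avoid checked exceptions; UrlOrderings and its constants are hypothetical and do not reproduce the UrlComparator source.

```java
import java.net.URI;
import java.util.Comparator;

// Hypothetical sketch: two defensible but conflicting orderings for URIs.
final class UrlOrderings {
    // Host, then path, then protocol (scheme), then port, as described above.
    static final Comparator<URI> HOST_FIRST =
        Comparator.comparing(URI::getHost)
                  .thenComparing(URI::getPath)
                  .thenComparing(URI::getScheme)
                  .thenComparingInt(URI::getPort);

    // Protocol (scheme), then host, then path, then port.
    static final Comparator<URI> PROTOCOL_FIRST =
        Comparator.comparing(URI::getScheme)
                  .thenComparing(URI::getHost)
                  .thenComparing(URI::getPath)
                  .thenComparingInt(URI::getPort);

    public static void main(String[] args) {
        URI a = URI.create("https://a.example/");
        URI b = URI.create("ftp://b.example/");
        // HOST_FIRST puts a before b; PROTOCOL_FIRST puts b before a.
        System.out.println(HOST_FIRST.compare(a, b) < 0);     // prints true
        System.out.println(PROTOCOL_FIRST.compare(a, b) > 0); // prints true
    }
}
```

Neither ordering is more correct than the other, which is the objection being raised.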

> And it
> doesn't seem to make as much sense to have the URL comparator in a net
> component.
>
> > It seems that if a Collection or Comparator would seem out of place in the
> > JDK, it's not a good candidate for Collections.  PackageNameComparator and
> > UrlComparator in particular seem out of place.  Most comparators are highly
> > specific, and comparators are also very easy to write.  Consequently, we
> > should be wary of which Comparators we include.  Bay, would you object to
> > removing NumericStringComparator, PackageNameComparator and UrlComparator?
>
> these comparators (by name only) seem generic enough.  They don't depend
> on anything other than what's already in the JDK (String, URL).  I would
> agree that they would be out of place if the comparator was comparing
> things that are not in the JDK though.  More specifically, a comparator
> that depends on something other than stuff in the JDK is probably too
> specific and would be out of place.  I'm just not sure that these
> comparators are.

My objection is mainly their arbitrary ordering.  They try to sort Objects
that don't have undeniable order.  That's why I'd rather see us focus on
Comparators that reverse other comparators (ReverseComparator), or
comparators that can be used to implement SortedSets or SortedMaps
(ComparableComparator).  Other possibilities might be classes that let you
chain Comparators together, or Comparators that truly address oversights in
the JDK.  Comparators like the Soundex Comparator also make sense, because
Soundex has an implicit ordering (but Soundex is also very specific, and
it's more appropriate where it is in Codec.)  I'd rather not see us try to
ascribe an arbitrary ordering to Objects like URLs and package names.
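
The kind of Comparator utility described above might look like the following sketch: one wrapper that reverses another Comparator and one that chains several together. It assumes a modern JDK with generics; ReverseOrder, ChainedOrder and ComparatorUtilityDemo are hypothetical names and are not the ReverseComparator shipped in Collections. Both wrappers are declared Serializable, but they only serialize successfully if the Comparators they wrap do too.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical wrapper that reverses the order of another comparator.
final class ReverseOrder<T> implements Comparator<T>, Serializable {
    private static final long serialVersionUID = 1L;
    private final Comparator<T> delegate;

    ReverseOrder(Comparator<T> delegate) { this.delegate = delegate; }

    @Override
    public int compare(T a, T b) { return delegate.compare(b, a); }
}

// Hypothetical chain: the first comparator that reports a difference decides.
final class ChainedOrder<T> implements Comparator<T>, Serializable {
    private static final long serialVersionUID = 1L;
    private final List<Comparator<T>> steps;

    @SafeVarargs
    ChainedOrder(Comparator<T>... steps) {
        this.steps = new ArrayList<>(Arrays.asList(steps));
    }

    @Override
    public int compare(T a, T b) {
        for (Comparator<T> step : steps) {
            int result = step.compare(a, b);
            if (result != 0) {
                return result;
            }
        }
        return 0; // every comparator in the chain considers a and b equal
    }
}

final class ComparatorUtilityDemo {
    static final Comparator<String> BY_LENGTH = Comparator.comparingInt(String::length);
    static final Comparator<String> ALPHABETICAL = Comparator.naturalOrder();
    static final Comparator<String> LENGTH_THEN_ALPHA = new ChainedOrder<>(BY_LENGTH, ALPHABETICAL);
    static final Comparator<String> Z_TO_A = new ReverseOrder<>(ALPHABETICAL);
}
```

Neither wrapper imposes an ordering of its own; each only combines orderings that callers supply.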

> I'm not sure that "easy to write" is a great argument for excluding them.
> If it was, we wouldn't include things like ProxyIterator,
> SingletonIterator, or even IteratorEnumeration.  Having the ability to
> reuse code rather than just redo it (even if it is simple) is one of the
> tenets of the commons project.

But surely there's quite a distinction in scope between these classes that
deal with iteration in general and a class that only sorts URLs?

> > Second, none of the Comparators are Serializable.  Shouldn't they be, so
> > that their corresponding Collections will be serializable?
>
> that's probably a good idea.  :)

Thanks, I thought so too.  :)

>
> michael
>




Re: [COLLECTIONS] Comparator observations

Posted by "Michael A. Smith" <mi...@iammichael.org>.
On Thu, 28 Feb 2002, Morgan Delagrange wrote:
> First, I'm not sure all of these Comparators are generic enough to include
> in Collections.  ComparableComparator and ReverseComparator seem to be right
> on.  NumericStringComparator, PackageNameComparator, and UrlComparator seem
> too specific for Collections though.

I'm just curious, but why do you feel they are too specific?  If someone
wants to create a TreeSet (or FastTreeSet, or pretty much any ordered
set), which contains URLs, a URL comparator might be useful.  And it
doesn't seem to make as much sense to have the URL comparator in a net
component.

> It seems that if a Collection or Comparator would seem out of place in the
> JDK, it's not a good candidate for Collections.  PackageNameComparator and
> UrlComparator in particular seem out of place.  Most comparators are highly
> specific, and comparators are also very easy to write.  Consequently, we
> should be wary of which Comparators we include.  Bay, would you object to
> removing NumericStringComparator, PackageNameComparator and UrlComparator?

these comparators (by name only) seem generic enough.  They don't depend 
on anything other than what's already in the JDK (String, URL).  I would 
agree that they would be out of place if the comparator was comparing 
things that are not in the JDK though.  More specifically, a comparator 
that depends on something other than stuff in the JDK is probably too 
specific and would be out of place.  I'm just not sure that these 
comparators are.

I'm not sure that "easy to write" is a great argument for excluding them.  
If it was, we wouldn't include things like ProxyIterator,
SingletonIterator, or even IteratorEnumeration.  Having the ability to
reuse code rather than just redo it (even if it is simple) is one of the
tenets of the commons project.

> Second, none of the Comparators are Serializable.  Shouldn't they be, so
> that their corresponding Collections will be serializable?

that's probably a good idea.  :)


michael


