You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Henry Story <he...@bblfish.net> on 2004/02/09 22:50:24 UTC
An improvement to AbstractHashedMap, and a new HashedSet class
Hi,
A little background to this contribution is in order. I was looking at
an interesting framework for Value Objects written by Dirk Riehle,
available under the lgpl at http://www.jvalue.org/, when I found a
small bug in his code. It was obvious that what was needed was a
WeakHashedSet - a HashSet that would wrap its contents in
WeakReference's. Given the difficulty of extending java.util classes I
opted to look at the apache collections package instead.
I first decided to create a HashedSet class, an equivalent of
java.net.HashSet. The best way to do this I realised, would be to
extend the AbstractHashedMap. The JavaDoc for this class says:
> This class implements all the features necessary for a subclass
hash-based map.
> Key-value entries are stored in instances of the HashEntry class,
which can be
> overridden and replaced. The iterators can similarly be replaced,
without the
> need to replace the KeySet, EntrySet and Values view classes.
But in fact it is not quite as extensible as it could be because the
HashEntry is currently a class, not an interface. Let me explain by
taking the simpler example of rewriting the java.util.HashSet class.
HashSet
map -> HashMap
entries -> HashEntries[]
key -> Object
value -> Object
next -> HashEntry
This picture show that HashSet contains a pointer to a HashMap, which
contains a pointer to an Array of HashEntries. Each HashEntry contains
a key pointer, a value pointer and a pointer to the next HashEntry.
A HashSet uses the HashMap because the keys in a HashMap are unique. As
a result the value pointer in the HashEntry has really no role to play.
The way AbstractHashedMap is written in the apache collection means
that to simply follow the above pattern would waste one pointer for
every HashEntry, that is 4 bytes wasted for every entry.
The solution is simple.
1. I turned the HashEntry inner class into an inner interface of
AbstractHashedMap.
2. I added a few methods to get/set the key and the value. These
methods should be inlined by all modern compilers, so I don't think
there is any efficiency problem here.
3. I then wrote a HashedSet which has an inner class that extends
AbstractHashedMap, but which takes a modified HashEntry class where the
key and the value are the same object. The resulting
net.bblfish.util.HashedSet is thus more memory efficient than
java.util.HashSet.
HashedSet
map -> SpecialisedHashMap
entries -> HashEntries[]
key -> Object
next -> HashEntry
Perhaps it would be worth adding these changes to the library.
If the change to
org.apache.commons.collections.map.AbstractHashedMap.HashEntry is
accepted then it will make it relatively easy to write a WeakHashedSet.
It will be just a matter of extending WeakReference by implementing the
HashEntry methods.
WeakHashedSet
map -> SpeacialWeakHashMap
entries -> WeakHashEntries[] extends WeakReference
implements HashEntry
key -> Object
next -> HashEntry
Henry
I have placed the changes in a tar file at:
http://bblfish.net/tmp/changed.tar
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: new HashedSet + WeakHashedSet classes
Posted by Henry Story <he...@bblfish.net>.
The source is now available online under the src directory as well as
the original
changes.tar file.
http://bblfish.net/java/collections/src
Henry
On 13 Feb 2004, at 10:06, Henry Story wrote:
> I have now finished the two new classes HashedSet and WeakHashedSet. I
> have added some tests for them, though I should add a lot more.
>
> To make this work I needed to make a few changes to AbstractHashedMap.
> These changes had a few repercussions on other classes. But they now
> run all the tests successfully.
>
> They classes with UML diagrams and some explanations are available on
> my web site at:
> http://bblfish.net/java/collections/
>
>
> Please let me know if this is the wrong place to post this.
>
> Thanks,
>
> Henry Story
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org
Re: new HashedSet + WeakHashedSet classes
Posted by Henry Story <he...@bblfish.net>.
The source is now available online under the src directory as well as
the original
changes.tar file.
http://bblfish.net/java/collections/src
Henry
On 13 Feb 2004, at 10:06, Henry Story wrote:
> I have now finished the two new classes HashedSet and WeakHashedSet. I
> have added some tests for them, though I should add a lot more.
>
> To make this work I needed to make a few changes to AbstractHashedMap.
> These changes had a few repercussions on other classes. But they now
> run all the tests successfully.
>
> They classes with UML diagrams and some explanations are available on
> my web site at:
> http://bblfish.net/java/collections/
>
>
> Please let me know if this is the wrong place to post this.
>
> Thanks,
>
> Henry Story
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
new HashedSet + WeakHashedSet classes
Posted by Henry Story <he...@bblfish.net>.
I have now finished the two new classes HashedSet and WeakHashedSet. I
have added some tests for them, though I should add a lot more.
To make this work I needed to make a few changes to AbstractHashedMap.
These changes had a few repercussions on other classes. But they now
run all the tests successfully.
They classes with UML diagrams and some explanations are available on
my web site at:
http://bblfish.net/java/collections/
Please let me know if this is the wrong place to post this.
Thanks,
Henry Story
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org
new HashedSet + WeakHashedSet classes
Posted by Henry Story <he...@bblfish.net>.
I have now finished the two new classes HashedSet and WeakHashedSet. I
have added some tests for them, though I should add a lot more.
To make this work I needed to make a few changes to AbstractHashedMap.
These changes had a few repercussions on other classes. But they now
run all the tests successfully.
They classes with UML diagrams and some explanations are available on
my web site at:
http://bblfish.net/java/collections/
Please let me know if this is the wrong place to post this.
Thanks,
Henry Story
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: new HashedSet + WeakHashedSet classes
Posted by "matthew.hawthorne" <ma...@apache.org>.
Henry Story wrote:
> I have now finished the two new classes HashedSet and WeakHashedSet. I
> have added some tests for them, though I should add a lot more.
>
> To make this work I needed to make a few changes to AbstractHashedMap.
> These changes had a few repercussions on other classes. But they now run
> all the tests successfully.
>
> They classes with UML diagrams and some explanations are available on my
> web site at:
> http://bblfish.net/java/collections/
You don't need to cross post. Since your talking about new classes and
submitting patches, this discussion probably belongs on the developer
list, not the user list. And Stephen is subscribed to the list, so you
shouldn't need to email him directly.
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: An improvement to AbstractHashedMap, and a new HashedSet class
Posted by Henry Story <he...@bblfish.net>.
I am using the collections library from CVS and did not have a problem
with the minor re-factoring. After a few quick cosmetic changes,
everything works fine as far as I can tell.
I have placed a couple of UML diagrams on my web site with the code to
help explain what I am doing. You can find the files at:
http://bblfish.net/java/collections/
The collections.jpg picture is a quick sketch of the current
collections.map class hierarchy. As you can see AbstractHashedMap
contains a collection of HashEntry objects. Each of these have a key
and a value. Various subclasses of AbstractHashedMap implement
different specialisations of hashes, some of which require
specialisations of HashedEntry (eg IdentityMap requires an
IdentityEntry).
This diagram does not show the inner class relationships. For example
HashedEntry is an inner class of AbstractHashedMap.
What I am proposing is really only a very small change. (Even if the
next diagram looks very different)
If you looks at collections-altered.jpg you will see that the only
novelty is that HashEntry is now an interface, with most of the code
that it used to contain now in DefaultHashEntry. I have shown the
packages nested inside one another to make clear the static inner
class/interface relationships. As you see DefaultHashEntry has three
fields, but ReflexiveHashEntry only has two, since it is a data
structure that is used for a set where they map.key = map.value.
The only change needed is the interface change - making HashEntry an
interface.
This allows us to bypass the default implementation of HashEntry.
I just added the HashedSet as an example of how it could come in useful.
On 10 Feb 2004, at 18:57, Stephen Colebourne wrote:
> Unfortunately, as version 3 of [collections] has now been released with
> HashEntry as a class, that cannot be changed to an interface.
>
> I would however, support the addition of a similar AbstractHashedSet
> class
> to [collections], preferably with a default HashedSet implementation.
>
> Stephen
>
>
> ----- Original Message -----
> From: "Henry Story" <he...@bblfish.net>
>
>
>> Hi,
>>
>> A little background to this contribution is in order. I was looking at
>> an interesting framework for Value Objects written by Dirk Riehle,
>> available under the lgpl at http://www.jvalue.org/, when I found a
>> small bug in his code. It was obvious that what was needed was a
>> WeakHashedSet - a HashSet that would wrap its contents in
>> WeakReference's. Given the difficulty of extending java.util classes I
>> opted to look at the apache collections package instead.
>>
>> I first decided to create a HashedSet class, an equivalent of
>> java.net.HashSet. The best way to do this I realised, would be to
>> extend the AbstractHashedMap. The JavaDoc for this class says:
>>
>>> This class implements all the features necessary for a subclass
>> hash-based map.
>>> Key-value entries are stored in instances of the HashEntry class,
>> which can be
>>> overridden and replaced. The iterators can similarly be replaced,
>> without the
>>> need to replace the KeySet, EntrySet and Values view classes.
>>
>> But in fact it is not quite as extensible as it could be because the
>> HashEntry is currently a class, not an interface. Let me explain by
>> taking the simpler example of rewriting the java.util.HashSet class.
>>
>> HashSet
>> map -> HashMap
>> entries -> HashEntries[]
>> key -> Object
>> value -> Object
>> next -> HashEntry
>>
>> This picture show that HashSet contains a pointer to a HashMap, which
>> contains a pointer to an Array of HashEntries. Each HashEntry contains
>> a key pointer, a value pointer and a pointer to the next HashEntry.
>>
>> A HashSet uses the HashMap because the keys in a HashMap are unique.
>> As
>> a result the value pointer in the HashEntry has really no role to
>> play.
>> The way AbstractHashedMap is written in the apache collection means
>> that to simply follow the above pattern would waste one pointer for
>> every HashEntry, that is 4 bytes wasted for every entry.
>>
>> The solution is simple.
>> 1. I turned the HashEntry inner class into an inner interface of
>> AbstractHashedMap.
>> 2. I added a few methods to get/set the key and the value. These
>> methods should be inlined by all modern compilers, so I don't think
>> there is any efficiency problem here.
>> 3. I then wrote a HashedSet which has an inner class that extends
>> AbstractHashedMap, but which takes a modified HashEntry class where
>> the
>> key and the value are the same object. The resulting
>> net.bblfish.util.HashedSet is thus more memory efficient than
>> java.util.HashSet.
>>
>> HashedSet
>> map -> SpecialisedHashMap
>> entries -> HashEntries[]
>> key -> Object
>> next -> HashEntry
>>
>>
>> Perhaps it would be worth adding these changes to the library.
>>
>> If the change to
>> org.apache.commons.collections.map.AbstractHashedMap.HashEntry is
>> accepted then it will make it relatively easy to write a
>> WeakHashedSet.
>> It will be just a matter of extending WeakReference by implementing
>> the
>> HashEntry methods.
>>
>> WeakHashedSet
>> map -> SpeacialWeakHashMap
>> entries -> WeakHashEntries[] extends WeakReference
>> implements HashEntry
>> key -> Object
>> next -> HashEntry
>>
>>
>> Henry
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org
Re: An improvement to AbstractHashedMap, and a new HashedSet class
Posted by Henry Story <he...@bblfish.net>.
I am using the collections library from CVS and did not have a problem
with the minor re-factoring. After a few quick cosmetic changes,
everything works fine as far as I can tell.
I have placed a couple of UML diagrams on my web site with the code to
help explain what I am doing. You can find the files at:
http://bblfish.net/java/collections/
The collections.jpg picture is a quick sketch of the current
collections.map class hierarchy. As you can see AbstractHashedMap
contains a collection of HashEntry objects. Each of these have a key
and a value. Various subclasses of AbstractHashedMap implement
different specialisations of hashes, some of which require
specialisations of HashedEntry (eg IdentityMap requires an
IdentityEntry).
This diagram does not show the inner class relationships. For example
HashedEntry is an inner class of AbstractHashedMap.
What I am proposing is really only a very small change. (Even if the
next diagram looks very different)
If you looks at collections-altered.jpg you will see that the only
novelty is that HashEntry is now an interface, with most of the code
that it used to contain now in DefaultHashEntry. I have shown the
packages nested inside one another to make clear the static inner
class/interface relationships. As you see DefaultHashEntry has three
fields, but ReflexiveHashEntry only has two, since it is a data
structure that is used for a set where they map.key = map.value.
The only change needed is the interface change - making HashEntry an
interface.
This allows us to bypass the default implementation of HashEntry.
I just added the HashedSet as an example of how it could come in useful.
On 10 Feb 2004, at 18:57, Stephen Colebourne wrote:
> Unfortunately, as version 3 of [collections] has now been released with
> HashEntry as a class, that cannot be changed to an interface.
>
> I would however, support the addition of a similar AbstractHashedSet
> class
> to [collections], preferably with a default HashedSet implementation.
>
> Stephen
>
>
> ----- Original Message -----
> From: "Henry Story" <he...@bblfish.net>
>
>
>> Hi,
>>
>> A little background to this contribution is in order. I was looking at
>> an interesting framework for Value Objects written by Dirk Riehle,
>> available under the lgpl at http://www.jvalue.org/, when I found a
>> small bug in his code. It was obvious that what was needed was a
>> WeakHashedSet - a HashSet that would wrap its contents in
>> WeakReference's. Given the difficulty of extending java.util classes I
>> opted to look at the apache collections package instead.
>>
>> I first decided to create a HashedSet class, an equivalent of
>> java.net.HashSet. The best way to do this I realised, would be to
>> extend the AbstractHashedMap. The JavaDoc for this class says:
>>
>>> This class implements all the features necessary for a subclass
>> hash-based map.
>>> Key-value entries are stored in instances of the HashEntry class,
>> which can be
>>> overridden and replaced. The iterators can similarly be replaced,
>> without the
>>> need to replace the KeySet, EntrySet and Values view classes.
>>
>> But in fact it is not quite as extensible as it could be because the
>> HashEntry is currently a class, not an interface. Let me explain by
>> taking the simpler example of rewriting the java.util.HashSet class.
>>
>> HashSet
>> map -> HashMap
>> entries -> HashEntries[]
>> key -> Object
>> value -> Object
>> next -> HashEntry
>>
>> This picture show that HashSet contains a pointer to a HashMap, which
>> contains a pointer to an Array of HashEntries. Each HashEntry contains
>> a key pointer, a value pointer and a pointer to the next HashEntry.
>>
>> A HashSet uses the HashMap because the keys in a HashMap are unique.
>> As
>> a result the value pointer in the HashEntry has really no role to
>> play.
>> The way AbstractHashedMap is written in the apache collection means
>> that to simply follow the above pattern would waste one pointer for
>> every HashEntry, that is 4 bytes wasted for every entry.
>>
>> The solution is simple.
>> 1. I turned the HashEntry inner class into an inner interface of
>> AbstractHashedMap.
>> 2. I added a few methods to get/set the key and the value. These
>> methods should be inlined by all modern compilers, so I don't think
>> there is any efficiency problem here.
>> 3. I then wrote a HashedSet which has an inner class that extends
>> AbstractHashedMap, but which takes a modified HashEntry class where
>> the
>> key and the value are the same object. The resulting
>> net.bblfish.util.HashedSet is thus more memory efficient than
>> java.util.HashSet.
>>
>> HashedSet
>> map -> SpecialisedHashMap
>> entries -> HashEntries[]
>> key -> Object
>> next -> HashEntry
>>
>>
>> Perhaps it would be worth adding these changes to the library.
>>
>> If the change to
>> org.apache.commons.collections.map.AbstractHashedMap.HashEntry is
>> accepted then it will make it relatively easy to write a
>> WeakHashedSet.
>> It will be just a matter of extending WeakReference by implementing
>> the
>> HashEntry methods.
>>
>> WeakHashedSet
>> map -> SpeacialWeakHashMap
>> entries -> WeakHashEntries[] extends WeakReference
>> implements HashEntry
>> key -> Object
>> next -> HashEntry
>>
>>
>> Henry
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: An improvement to AbstractHashedMap, and a new HashedSet class
Posted by Stephen Colebourne <sc...@btopenworld.com>.
Unfortunately, as version 3 of [collections] has now been released with
HashEntry as a class, that cannot be changed to an interface.
I would however, support the addition of a similar AbstractHashedSet class
to [collections], preferably with a default HashedSet implementation.
Stephen
----- Original Message -----
From: "Henry Story" <he...@bblfish.net>
To: <co...@jakarta.apache.org>
Sent: Monday, February 09, 2004 9:50 PM
Subject: An improvement to AbstractHashedMap, and a new HashedSet class
> Hi,
>
> A little background to this contribution is in order. I was looking at
> an interesting framework for Value Objects written by Dirk Riehle,
> available under the lgpl at http://www.jvalue.org/, when I found a
> small bug in his code. It was obvious that what was needed was a
> WeakHashedSet - a HashSet that would wrap its contents in
> WeakReference's. Given the difficulty of extending java.util classes I
> opted to look at the apache collections package instead.
>
> I first decided to create a HashedSet class, an equivalent of
> java.net.HashSet. The best way to do this I realised, would be to
> extend the AbstractHashedMap. The JavaDoc for this class says:
>
> > This class implements all the features necessary for a subclass
> hash-based map.
> > Key-value entries are stored in instances of the HashEntry class,
> which can be
> > overridden and replaced. The iterators can similarly be replaced,
> without the
> > need to replace the KeySet, EntrySet and Values view classes.
>
> But in fact it is not quite as extensible as it could be because the
> HashEntry is currently a class, not an interface. Let me explain by
> taking the simpler example of rewriting the java.util.HashSet class.
>
> HashSet
> map -> HashMap
> entries -> HashEntries[]
> key -> Object
> value -> Object
> next -> HashEntry
>
> This picture show that HashSet contains a pointer to a HashMap, which
> contains a pointer to an Array of HashEntries. Each HashEntry contains
> a key pointer, a value pointer and a pointer to the next HashEntry.
>
> A HashSet uses the HashMap because the keys in a HashMap are unique. As
> a result the value pointer in the HashEntry has really no role to play.
> The way AbstractHashedMap is written in the apache collection means
> that to simply follow the above pattern would waste one pointer for
> every HashEntry, that is 4 bytes wasted for every entry.
>
> The solution is simple.
> 1. I turned the HashEntry inner class into an inner interface of
> AbstractHashedMap.
> 2. I added a few methods to get/set the key and the value. These
> methods should be inlined by all modern compilers, so I don't think
> there is any efficiency problem here.
> 3. I then wrote a HashedSet which has an inner class that extends
> AbstractHashedMap, but which takes a modified HashEntry class where the
> key and the value are the same object. The resulting
> net.bblfish.util.HashedSet is thus more memory efficient than
> java.util.HashSet.
>
> HashedSet
> map -> SpecialisedHashMap
> entries -> HashEntries[]
> key -> Object
> next -> HashEntry
>
>
> Perhaps it would be worth adding these changes to the library.
>
> If the change to
> org.apache.commons.collections.map.AbstractHashedMap.HashEntry is
> accepted then it will make it relatively easy to write a WeakHashedSet.
> It will be just a matter of extending WeakReference by implementing the
> HashEntry methods.
>
> WeakHashedSet
> map -> SpeacialWeakHashMap
> entries -> WeakHashEntries[] extends WeakReference
> implements HashEntry
> key -> Object
> next -> HashEntry
>
>
> Henry
>
> I have placed the changes in a tar file at:
> http://bblfish.net/tmp/changed.tar
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org