You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Stephen Colebourne <sc...@eurobell.co.uk> on 2002/03/28 23:27:23 UTC

[Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Hi,
As part of the Joda project (www.joda.org) I have developed the following
collections that I would like to propose for inclusion in Apache commons
collection:

1) TreeNode, ArrayTreeNode and TreeIterator
http://www.scolebourne.eurobell.co.uk/JodaTreeProposal.zip
There are no curent unit tests for these classes but I will be happy to
develop them if this design was acceptable.
The design in my proposal is based on that of the swing tree node, where the
node contains a list of children, a parent and a user object. I would be
interested in comments on this design. In particular, should tree node
contain a List of children (as now) or should it extend List. Also is the
user object is a useful feature (it seems to work well in the Joda code)?

I note that a Tree proposal from Kief Morris based on Map was submitted
recently -
http://marc.theaimsgroup.com/?l=jakarta-commons-dev&m=101561962307536&w=2. I
would also be interested in any feedback as to how these two proposals might
relate. (The designs are pretty different)

2) IdentityHashSet
http://www.scolebourne.eurobell.co.uk/JodaIdentityHashSetProposal.zip
There are no curent unit tests for this class but I will be happy to develop
them if this design was acceptable.
I believe Java 1.4 has such a class, thus should Apache commons include it?
This implementation is not at all optimised, but it does work. Feedback
welcome.

Thanks,
Stephen


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <un...@yahoo.com>.
> > Which suggests the right approach to take is to
> leave TreeNode as it is,
> and consider whether to make an implementation of
> Tree that uses TreeNode
> for its guts.
> 
> This was kind of where I was heading. What if the
> keys in your
> implementation became TreeNodes?

I'm not sure I see the point? I think TreeNode ought
to simply replace TreeEntry. A basic implementation of
Tree would just provide a Collection-based interface
(in the general sense of the word, not the Java sense)
for accessing TreeNodes, no use of keys. 

For my application which uses keys, I would subclass
TreeNode to include a key field. The key-based tree
should be an extension of the non-key-based tree (as
James Strachan suggested). But there's no reason to
use keys in the base implementation.

Let me explain why am I using a keyed Map. My
application stores a tree structure in a database, a
hierarchy of topics under which content items can be
posted (think Yahoo directory). When I build my tree,
I use the primary key field from the table as the key,
and I link children and parents based on a references
(i.e. a self-referencing parent_id field in the
table). HTML links for navigating the directory use
this ID to identify the node in the directory, so
being able to do a Tree.get(id) to get the node is
convenient.

I'll start drafting up some code for this.

Kief


__________________________________________________
Do You Yahoo!?
Yahoo! Greetings - send holiday greetings for Easter, Passover
http://greetings.yahoo.com/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
From: Kief Morris <ki...@bitbull.com>
> Stephen Colebourne typed the following on 12:17 AM 4/7/2002 +0100
> >I have tweaked our two sets of interfaces tonight to produce a
combination
> >that starts to hang together. I don't think I lost any functionality, but
> >they are only the interfaces at the moment. See what you think at
>
> Stephen, my personal situation right now is keeping me from getting into
> this. Why don't you go ahead with submitting TreeNode and related stuff,
> and when I get some more time (probably a few months) I'll dig back into
> this. Sorry to lame out on you.
>
> Kief

OK. I will proceed with implementing the interfaces previously posted. I
will then post the full Tree implementation for wider commons review.

Stephen


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <ki...@bitbull.com>.
Stephen Colebourne typed the following on 12:17 AM 4/7/2002 +0100
>I have tweaked our two sets of interfaces tonight to produce a combination
>that starts to hang together. I don't think I lost any functionality, but
>they are only the interfaces at the moment. See what you think at

Stephen, my personal situation right now is keeping me from getting into 
this. Why don't you go ahead with submitting TreeNode and related stuff, 
and when I get some more time (probably a few months) I'll dig back into 
this. Sorry to lame out on you.

Kief


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
> >Related to this is the question of hashCode and equals methods. (The
other
> >type of collection is a Set, which needs a hashCode). So far, my view is
to
> >use the == check for equals and the identity hashCode (ie. don't
implement
> >either method).
>
> These sounds better than the alternatives you examined and rejected. Have
> you looked at how the Sun or other commons collections implement these?

List and Map both perform equals() by looping around all their contents
comparing values. As I said before, a value only comparison won't do for
TreeNode because it ignores the structure. (Something different might be
possible for Tree, because the keys enable quick access to the whole tree).

> >...we should really
> >have a better toString() implementation anyway - more collections like -
> >including a list of the children.
>
> Yes, I agree the overhead of reflection isn't appropriate for toString.
Maybe
> cache the string value (list of children), and update it when new children
are
> added? Each node could maintain a string list of its children, and when it
> changes it, trigger its parent to do the same.

I think even that is more than we need, how about
TreeNode[x children,value=y]
as a format for TreeNode's toString()?


I have tweaked our two sets of interfaces tonight to produce a combination
that starts to hang together. I don't think I lost any functionality, but
they are only the interfaces at the moment. See what you think at
http://www.scolebourne.eurobell.co.uk/Tree1.zip

Stephen




--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <ki...@bitbull.com>.
Sorry for the delay in responding, too much going on outside the coding life.

Stephen Colebourne typed the following on 12:49 AM 4/2/2002 +0100
>> I would suggest it would be better to keep the list
>> holding the children better hidden from the user.
>> Children can be added and removed with methods such as
>> TreeNode.addChild() and TreeNode.removeChild(). 

>True, provided you only go down one level. By hiding the created child node,
>you cannot add any children to it, which is a bit of a problem.
...
>> A List or Collection of children can be returned from
>> TreeNode, but should be a defensive copy of the
>> internal list.

>I disagree. A dead copy means less to implement, but means we end up adding
>all the List methods to TreeNode. I would suggest that by using a live list
>of children we enable applications to use the full interface of List to
>manipulate that list of children, such as subList() and retainAll() that
>wouldn't be available otherwise. Having an addChild(TreeNode) method on
>TreeNode may well be a good idea for ease of use however.

OK, I'm still uncomfortable with the design, but I see the reasons for it. Having 
both the internal nuts and bolts interface of TreeNode and the smooth locked-down 
interface of Tree available sounds better and better the more I understand the
practicalities of your approach.

>Related to this is the question of hashCode and equals methods. (The other
>type of collection is a Set, which needs a hashCode). So far, my view is to
>use the == check for equals and the identity hashCode (ie. don't implement
>either method).

These sounds better than the alternatives you examined and rejected. Have 
you looked at how the Sun or other commons collections implement these?

>> A question, in ArrayTreeNode.toString() you use a
>> method called Reflection.getShortenedClassName(),
>> which I don't have anywhere in my system (using JDK
>> 1.3). Where does this come from?
>
>Another of my utilities is a set of reflection utilities. In this case the
>code got the class name minus the package name (which ironically doesn't use
>reflection!). Its available on joda.org if you want it, but we should really
>have a better toString() implementation anyway - more collections like -
>including a list of the children.

Yes, I agree the overhead of reflection isn't appropriate for toString. Maybe
cache the string value (list of children), and update it when new children are
added? Each node could maintain a string list of its children, and when it
changes it, trigger its parent to do the same. 

Kief


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
> It looks like ArrayTreeNode expects users to access
> and manipulate the children of a node by returning a
> List from getChildren(), which users then use to add,
> delete, and access the child nodes. I don't think this
> follows best practices as seen in the java.util
> collections and other commons collections.

Bear in mind the design was a combination of the swing TreeNode and
MutableTreeNode classes. Merge those two and add collections and you get my
TreeNode.

> I would suggest it would be better to keep the list
> holding the children better hidden from the user.
> Children can be added and removed with methods such as
> TreeNode.addChild() and TreeNode.removeChild(). This
> gives different implementations more flexibility in
> how they store children.

True, provided you only go down one level. By hiding the created child node,
you cannot add any children to it, which is a bit of a problem.

> Where your inner class TreeArrayList now has to check
> that the object passed in implements the appropriate
> interface, instead TreeNode.addChild() could simply
> take that interface as a parameter type and not take
> Object, simplifying the code. TreeArrayList can be
> dispensed with entirely, the methods implemented from
> TreeNode would be able to enforce needed requirements,
> such as setting the parent node.
>
> A List or Collection of children can be returned from
> TreeNode, but should be a defensive copy of the
> internal list.

I disagree. A dead copy means less to implement, but means we end up adding
all the List methods to TreeNode. I would suggest that by using a live list
of children we enable applications to use the full interface of List to
manipulate that list of children, such as subList() and retainAll() that
wouldn't be available otherwise. Having an addChild(TreeNode) method on
TreeNode may well be a good idea for ease of use however.

> It would be useful to add a method
> which returns an Iterator for the children of the
> node.

Don't we get this by having the children as a list? No need to duplicate.

> Also, the current implementation requires manual
> creation of TreeNodes by user code. I would suggest
> having methods which take an Object value parameter,
> and handle TreeNode construction internally. In fact,
> I would make these the default methods, with
> Node-handling methods separately named. e.g.:

This could work, but only if the addChild(Object value) method returned the
TreeNode it had created (otherwise you can't add children to it). But
addChild return TreeNode seems a little odd - maybe the name addChildValue
makes more sense? And my view would be that once you expose TreeNode as a
public interface, the default methods should refer to TreeNode.

> A final suggestion, the base interface TreeNode could
> assume that Collections rather than Lists are used
> internally to store children, to allow a lighter
> implementation for applications which don't care about
> ordering. This would just mean removing the
> getChild(int index) method.

Maybe. The trouble is that we would lose the nice extra methods on List for
manipulations. A ListTreeNode subclass could be written, but there would
then be no guarantees that the children added were ListTreeNodes, and the
getParent method would return TreeNode not ListTreeNode. For my current use
of trees, I only use iterators to access the tree, so a collection would be
fine. In the future, I expect index access to be significant which could be
a hassle.

Related to this is the question of hashCode and equals methods. (The other
type of collection is a Set, which needs a hashCode). So far, my view is to
use the == check for equals and the identity hashCode (ie. don't implement
either method). The alternatives are to return equals/hashcode based on the
value, or to traverse the tree to build a composite equals/hashcode. The
former strikes me as too weak - two nodes are not equal just because their
values are equal, because that ignores the children. The latter could
involve some very lengthy calculations, and doesn't strike me as practical.
This would appear to rule out a Set implementation as an alternative to
List. Given that, it seems to me that an ordered list of children isn't a
big overhead, yet gives many benefits.

> A question, in ArrayTreeNode.toString() you use a
> method called Reflection.getShortenedClassName(),
> which I don't have anywhere in my system (using JDK
> 1.3). Where does this come from?

Another of my utilities is a set of reflection utilities. In this case the
code got the class name minus the package name (which ironically doesn't use
reflection!). Its available on joda.org if you want it, but we should really
have a better toString() implementation anyway - more collections like -
including a list of the children.

I need to grab some sleep now, so I won't go into your API just now.

Stephen




--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <un...@yahoo.com>.
Stephen, I have some questions/suggestions on your
TreeNode implementation.

It looks like ArrayTreeNode expects users to access
and manipulate the children of a node by returning a
List from getChildren(), which users then use to add,
delete, and access the child nodes. I don't think this
follows best practices as seen in the java.util
collections and other commons collections. 

I would suggest it would be better to keep the list
holding the children better hidden from the user.
Children can be added and removed with methods such as
TreeNode.addChild() and TreeNode.removeChild(). This
gives different implementations more flexibility in
how they store children. 

Where your inner class TreeArrayList now has to check
that the object passed in implements the appropriate
interface, instead TreeNode.addChild() could simply
take that interface as a parameter type and not take
Object, simplifying the code. TreeArrayList can be
dispensed with entirely, the methods implemented from
TreeNode would be able to enforce needed requirements,
such as setting the parent node.

A List or Collection of children can be returned from
TreeNode, but should be a defensive copy of the
internal list. It would be useful to add a method
which returns an Iterator for the children of the
node.

Also, the current implementation requires manual
creation of TreeNodes by user code. I would suggest
having methods which take an Object value parameter,
and handle TreeNode construction internally. In fact,
I would make these the default methods, with
Node-handling methods separately named. e.g.:

/**
 * Add a value to this node's list of children. A 
 * TreeNode is automatically created and its parent
 * set.
 * @return Whether the child List was changed.
 */
boolean addChild(Object);

/**
 * Add a child Node to this node's list of children.
Its
 * parent is automatically set, if it had a different
parent
 * previously it is removed from the old parent's list
 * of children.
 * @return Whether the child List was changed.
 */
boolean addChildNode(TreeNode.Internal);


A final suggestion, the base interface TreeNode could
assume that Collections rather than Lists are used
internally to store children, to allow a lighter
implementation for applications which don't care about
ordering. This would just mean removing the
getChild(int index) method.

A question, in ArrayTreeNode.toString() you use a
method called Reflection.getShortenedClassName(),
which I don't have anywhere in my system (using JDK
1.3). Where does this come from?

Here's my suggested TreeNode:

public interface TreeNode {

	/**
	 * Get the parent of this node
	 * @return the object that is the parent of this node
	 */
	TreeNode getParent();

        /**
         * Add a value to this node's list of
children. A 
         * TreeNode is automatically created and its
parent
         * set.
         * @return Whether the child List was changed.
         */
	boolean addChild(Object value);

        /**
         * Add a child Node to this node's list of
children. Its
         * parent is automatically set, if it had a
different parent
         * previously it is removed from the old
parent's list
         * of children.
         * @return Whether the child List was changed.
         */
	boolean addChildNode(TreeNode.Internal node);

	/**
	 * Remove the first child whose value equals the
supplied object.
	 * @return Whether the child List was changed.
	 */
	boolean removeChild(Object value);

	/**
	 * Remove a child node from this node's list of
children.
	 * @return Whether the child List was changed.
	 */
	boolean removeChildNode(TreeNode.Internal node);

	/**
	 * Get the child values of this node
	 * @return an iterator for the child node values.
	 */
	Iterator children();

	/**
	 * Get the nodes which are children of this node.
	 * @return an iterator for the child nodes.
	 */
	Iterator childNodes();
    
    /** 
     * Returns a Collection of the child values.
Modifying this
     * collection will not modify the internal
structure of the
     * parent TreeNode.
     */
    Collection getChildCollection();
    
    /**
     * Returns a Collection of the child nodes.
Modifying this
     * collection will not modify the internal
structure of the
     * parent TreeNode unless the parent value is
changed.
     */
    Collection getChildNodeCollection();

	/**
	 * Get a tree iterator from this node, returns
TreeNodes for
     * everything in the subtree below the current
node. Does not
     * include the node itself.
	 * Traversal is depth first.
	 * @return the tree iterator
	 */
	Iterator subtree();

	/**
	 * Is this a root node (has no parent)
	 * @return true if no parent
	 */
	boolean isRoot();

	/**
	 * Is this a leaf node (has no children)
	 * @return true if no children
	 */
	boolean isLeaf();

	/**
	 * Get the user object of this node
	 * @return the user object value
	 */
	Object getValue();
	
	/**
	 * Set the user object of this node
	 * @param object  the new user object value
	 */
	void setValue(Object object);

	/**
	 * TreeNode implementors should implement
TreeNode.Internal
	 * instead of just TreeNode. This interface hides the
public
	 * but unsafe setParent method
	 */	
	public interface Internal extends TreeNode {
    	/**
    	 * Set the parent of this node
    	 * @param node  the new parent node
    	 */
    	void setParent(TreeNode node);
	}
}


__________________________________________________
Do You Yahoo!?
Yahoo! Greetings - send holiday greetings for Easter, Passover
http://greetings.yahoo.com/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
> > Thus my TreeNode class
> > corresponds most closely
> > to the TreeEntry class of yours, not Tree.
>
> Which suggests the right approach to take is to leave TreeNode as it is,
and consider whether to make an implementation of Tree that uses TreeNode
for its guts.

This was kind of where I was heading. What if the keys in your
implementation became TreeNodes?
public interface Tree extends Map {
// Map methods are used to set/get the values
/** Set value on a node */
 put(node, value)
/** Get value from node */
 get(node)
/** remove from parent */
 remove(node)
/** Get all the tree nodes */
 keySet()
/** Get all the values */
 values()
// additional Tree methods amend the tree structure
/** Get the root node */
 getRootNode()
/** add child */
 add(parentNode, childNode, value)
...
}
Internally, Tree would hold TreeEntry objects. Each TreeEntry would have the
node as the key and the value as the value. Would this style of interface
work for you? (It involves no change to TreeNode)

Stephen



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <ki...@bitbull.com>.
> I realised that the
> difference between the two implementations is that
> you view the tree as a
> single 'flat' collection (which happens to be a
> Map). Tree allows the values
> to be added and removed without ever really knowing
> much about the structure
> of the tree.  In fact, you go further in your
> Javadoc and say that
> applications shouldn't know about the TreeEntry
> class.
> 
> My implementation views the tree as nodes - there is
> no single object
> representing the whole tree. 

Yes, I noticed this difference too, but had been thinking we could reconcile them. As I think on it some more, maybe it's not such an easy thing.

My rationale was to make Tree look and act the same as other collections: users would add and retrieve Objects without having to know anything about how the Tree code maintains the relationships. This is consistent with other collections in commons and the JDK. 

For example, a linked list works very much the way a tree does, but the JDK LinkedList class hides its Entry code in an inner class. Users can't call a next() or prev() method, they have to use the standard List methods, and don't generally know it's any different from an Array backed List.

Once I looked at it this way, using a Map to represent the Tree internally made sense. It keeps the implementation fairly small, using only a single Map, rather than a full List for every node. 

I was thinking that modifying my approach to achieve the functionality of yours would be simple, just remove the requirement for a named key for each node, and use Lists for the internal representation of nodes so order can be preserved. My key, flawed, assumption was that hiding the entry structure from the user was the "right" thing to do, and wouldn't lose any real functionality: given a node, you can easily retrieve its children by using a get(Object parent) method. A node's parent could be gotten with getParent(Object node). The internal code would handle the details, and the user wouldn't suffer any more than they do from not being able to get inside the LinkedList implementation.

The flaw in my thinking is that it assumes each node is a unique object, so internal code can implement those methods with traversals or map lookups. This works for my application, and would probably be useful for others too, but I'm sure some people are going to want to store a single object at more than one place in a tree. But I can't see how to allow this without exposing the internal nodes, because otherwise there is no way to uniquely indentify which occurrence of the Object to get children or a parent for.

> Thus my TreeNode class
> corresponds most closely
> to the TreeEntry class of yours, not Tree. 

Which suggests the right approach to take is to leave TreeNode as it is, and consider whether to make an implementation of Tree that uses TreeNode for its guts.

So, TreeNode should be available for users who need the functionality, and Tree implementations for those who want a Collections-style interface and don't need duplicate node objects. Maybe Tree should have a less generic name?

I still think my internal-Map implementation is more efficient if users won't need to get at the internal node structure, but a TreeNode implementation of Tree would be useful for those who want a container for the Tree but still want to get at nodes.

Kief


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <un...@yahoo.com>.
> I realised that the
> difference between the two implementations is that
> you view the tree as a
> single 'flat' collection (which happens to be a
> Map). Tree allows the values
> to be added and removed without ever really knowing
> much about the structure
> of the tree.  In fact, you go further in your
> Javadoc and say that
> applications shouldn't know about the TreeEntry
> class.
> 
> My implementation views the tree as nodes - there is
> no single object
> representing the whole tree. 

Yes, I noticed this difference too, but had been
thinking we could reconcile them. As I think on it
some more, maybe it's not such an easy thing.

My rationale was to make Tree look and act the same as
other collections: users would add and retrieve
Objects without having to know anything about how the
Tree code maintains the relationships. This is
consistent with other collections in commons and the
JDK. 

For example, a linked list works very much the way a
tree does, but the JDK LinkedList class hides its
Entry code in an inner class. Users can't call a
next() or prev() method, they have to use the standard
List methods, and don't generally know it's any
different from an Array backed List.

Once I looked at it this way, using a Map to represent
the Tree internally made sense. It keeps the
implementation fairly small, using only a single Map,
rather than a full List for every node. 

I was thinking that modifying my approach to achieve
the functionality of yours would be simple, just
remove the requirement for a named key for each node,
and use Lists for the internal representation of nodes
so order can be preserved. My key, flawed, assumption
was that hiding the entry structure from the user was
the "right" thing to do, and wouldn't lose any real
functionality: given a node, you can easily retrieve
its children by using a get(Object parent) method. A
node's parent could be gotten with getParent(Object
node). The internal code would handle the details, and
the user wouldn't suffer any more than they do from
not being able to get inside the LinkedList
implementation.

The flaw in my thinking is that it assumes each node
is a unique object, so internal code can implement
those methods with traversals or map lookups. This
works for my application, and would probably be useful
for others too, but I'm sure some people are going to
want to store a single object at more than one place
in a tree. But I can't see how to allow this without
exposing the internal nodes, because otherwise there
is no way to uniquely indentify which occurrence of
the Object to get children or a parent for.

> Thus my TreeNode class
> corresponds most closely
> to the TreeEntry class of yours, not Tree. 

Which suggests the right approach to take is to leave
TreeNode as it is, and consider whether to make an
implementation of Tree that uses TreeNode for its
guts.

So, TreeNode should be available for users who need
the functionality, and Tree implementations for those
who want a Collections-style interface and don't need
duplicate node objects. Maybe Tree should have a less
generic name?

I still think my internal-Map implementation is more
efficient if users won't need to get at the internal
node structure, but a TreeNode implementation of Tree
would be useful for those who want a container for the
Tree but still want to get at nodes.

Kief



__________________________________________________
Do You Yahoo!?
Yahoo! Greetings - send holiday greetings for Easter, Passover
http://greetings.yahoo.com/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <un...@yahoo.com>.
> I realised that the
> difference between the two implementations is that
> you view the tree as a
> single 'flat' collection (which happens to be a
> Map). Tree allows the values
> to be added and removed without ever really knowing
> much about the structure
> of the tree.  In fact, you go further in your
> Javadoc and say that
> applications shouldn't know about the TreeEntry
> class.
> 
> My implementation views the tree as nodes - there is
> no single object
> representing the whole tree. 

Yes, I noticed this difference too, but had been
thinking we could reconcile them. As I think on it
some more, maybe we're looking at it backwards.

My rationale was to make Tree look and act the same as
other collections: users would add and retrieve
Objects without having to know anything about how the
Tree code maintains the relationships. This is
consistent with other collections in commons and the
JDK. 

For example, a linked list works very much the way a
tree does, but the JDK LinkedList class hides its
Entry code in an inner class. Users can't call a
next() or prev() method, they have to use the standard
List methods, and don't generally know it's any
different from an Array backed List.

Once I looked at it this way, using a Map to represent
the Tree internally made sense. It keeps the
implementation fairly small, using only a single Map,
rather than a full List for every node. 

I was thinking that modifying my approach to achieve
the functionality of yours would be simple, just
remove the requirement for a named key for each node,
and use Lists for the internal representation of nodes
so order can be preserved. My key, flawed, assumption
was that hiding the entry structure from the user was
the "right" thing to do, and wouldn't lose any real
functionality: given a node, you can easily retrieve
its children by using a get(Object parent) method. A
node's parent could be gotten with getParent(Object
node). The internal code would handle the details, and
the user wouldn't suffer any more than they do from
not being able to get inside the LinkedList
implementation.

The flaw in my thinking is that it assumes each node
is a unique object, so internal code can implement
those methods with traversals or map lookups. This
works for my application, and would probably be useful
for others too, but I'm sure some people are going to
want to store a single object at more than one place
in a tree. But I can't see how to allow this without
exposing the internal nodes, because otherwise there
is no way to uniquely indentify which occurrence of
the Object to get children or a parent for.

> Thus my TreeNode class
> corresponds most closely
> to the TreeEntry class of yours, not Tree. 

Which suggests turning around the way we've been
approaching this: let's leave TreeNode as it is, and
I'll scrap the guts of my Tree implementations and
replace them with TreeNode. This way TreeNode is
available for those who need to get their hands on the
nodes, and Tree will offer a Collection-based
interface for those who don't. 

Kief


__________________________________________________
Do You Yahoo!?
Yahoo! Greetings - send holiday greetings for Easter, Passover
http://greetings.yahoo.com/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
> A rough sketch of the interface, which admittedly follows the architecture
of
> my own implementation more than yours, would be like:
>
> public interface Tree extends Collection {
> /** Throws an exception if the parent isn't in the Tree already. */
> boolean add(Object parent, Object child);
> boolean addToRoot(Object child);
> boolean addAll(Object parent, Collection children);
> Iterator children(Object parent);
> Iterator rootChildren();
> /** All values in the Tree */
> Iterator iterator();
> }
>
> It would need to be fleshed out, but at this level wouldn't assume
anything
> about ordering, so for instance a get(Object parent, int position) method
> would be instead added to a ListTree interface. A SetTree could either
> guarantee that a value appears only once in the Tree, or else would just
> guarantee that it only appears once as a child of any particular node.
> MapTree would map each value to a unique key, unique either for the tree
> or perhaps just for a given node. My current implementation has keys as
> unique across the Tree, but that's too restrictive.

At this point, I was starting to get confused, so I went back and looked at
your initial Tree, TreeEntry and StandardTree classes. I realised that the
difference between the two implementations is that you view the tree as a
single 'flat' collection (which happens to be a Map). Tree allows the values
to be added and removed without ever really knowing much about the structure
of the tree.  In fact, you go further in your Javadoc and say that
applications shouldn't know about the TreeEntry class.

My implementation views the tree as nodes - there is no single object
representing the whole tree. Thus my TreeNode class corresponds most closely
to the TreeEntry class of yours, not Tree. The value must be manually
obtained by the application from the TreeNode class.

I have had a go to see if I could rearrange Tree to be a flat view over
TreeNodes and extend Collection. So far I have failed. It seems to me to be
related to the issue as to whether Map should have extended Collection. It
could have done, but the restriction would have been that all objects in the
collection were Map.Entry.

Tree is more complex. I considered a what-if extrapolated from Map: What if
Tree extended Collection with the restriction that the only objects that
could be added are TreeNodes. Unlike the Map case, this does not solve the
problem, because the 'state' of the tree (parent/children) is held in the
TreeNode being passed in. There is nothing left for the add() method to do.


So, the one thing I don't understand is the desire to use Map and keys as
part of the tree design. Can you explain the reasoning behind this? I
currently think that they exist because you chose not to expose the
TreeEntry class to the application, but I could be wrong.

Stephen


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <ki...@bitbull.com>.
Stephen Colebourne typed the following on 02:59 PM 3/29/2002 +0000
>> We ought to make a common interface for both versions, which would look
>> more like your TreeNode than my Tree, although I suggest we name it Tree.
...
>Makes sense in theory, but the main problem that I foresee is that it is not
>possible to have a single class that implements both Map and
>Collection/List. I don't know how that would affect your implementation.
>Maybe it just means that your implementation won't implement Map, but will
>still have the same methods.

Doh, you're right. I guess it's got to be a Collection. List doesn't seem right,
since all of the items have to be strictly ordered, allowing the user to do a
list.get(int), for example. Map doesn't work for the general case: even we
designed it with methods such as add(parent, child), which looks map-like,
each key of a Map is only supposed to have one value.

>This was one of the questions about my design. The children of a given tree
>node can be held as a property of the node, or the node itself could be a
>list of children. If we did extend Collection, what would add() mean? If it
>means adding children, then TreeNode/Tree should probably extend List not
>Collection.

Well, implementing add() is an optional operation for a Collection, we could
just throw an UnsupportedOperationException if it doesn't fit our design. 
Alternately, add() could just add the value to the root of the Tree. 

A rough sketch of the interface, which admittedly follows the architecture of
my own implementation more than yours, would be like:

public interface Tree extends Collection {

	/** Throws an exception if the parent isn't in the Tree already. */
	boolean add(Object parent, Object child);

	boolean addToRoot(Object child);

	boolean addAll(Object parent, Collection children);

	Iterator children(Object parent);

	Iterator rootChildren();

	/** All values in the Tree */
	Iterator iterator();

}

It would need to be fleshed out, but at this level wouldn't assume anything
about ordering, so for instance a get(Object parent, int position) method
would be instead added to a ListTree interface. A SetTree could either 
guarantee that a value appears only once in the Tree, or else would just
guarantee that it only appears once as a child of any particular node.
MapTree would map each value to a unique key, unique either for the tree
or perhaps just for a given node. My current implementation has keys as
unique across the Tree, but that's too restrictive.

Kief


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by James Strachan <ja...@yahoo.co.uk>.
From: "Stephen Colebourne" <sc...@eurobell.co.uk>
> > We ought to make a common interface for both versions, which would look
> > more like your TreeNode than my Tree, although I suggest we name it
Tree.
> > Then my interface would extend this interface, and be called MapTree or
> > KeyedTree. Your implementation, ArrayTreeNode or maybe just ArrayTree,
> > would be the basic implementation, and my implemention would subclass
> > it. If this turns out to be impractical, we can make an abstract class
> with
> > common code, which both versions would then subclass.
>
> Makes sense in theory, but the main problem that I foresee is that it is
not
> possible to have a single class that implements both Map and
> Collection/List. I don't know how that would affect your implementation.
> Maybe it just means that your implementation won't implement Map, but will
> still have the same methods.

I like the has-a rather than is-a relationship here. e.g.

public interface Tree {
    ...
    Collection getChildren();
}

then a derived Tree interface could offer they keyed children alternative,
while still offering the ability to access the children as a collection...

public interface MapTree extends Tree {
    ...
    Map getChildrenMap();

}

I've not looked at both of your Tree implementations but would this help
unify the two?

James


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Stephen Colebourne <sc...@eurobell.co.uk>.
> Stephen Colebourne typed the following on 10:27 PM 3/28/2002 +0000
> >As part of the Joda project (www.joda.org) I have developed the following
> >collections that I would like to propose for inclusion in Apache commons
> >collection:
> >
> >1) TreeNode, ArrayTreeNode and TreeIterator
> >http://www.scolebourne.eurobell.co.uk/JodaTreeProposal.zip
...
> I've had a look over TreeNode, I think they're not so different that they
> can't be tied together to share some of their interfaces and maybe some
> code. The essential difference is that my version stores objects using
> keys, like a map, while yours keeps them list-style. It makes sense to
> me to have both, since yours is more fundamental, and my key functionality
> is just an extension of the concept.

Sounds good.

> We ought to make a common interface for both versions, which would look
> more like your TreeNode than my Tree, although I suggest we name it Tree.
> Then my interface would extend this interface, and be called MapTree or
> KeyedTree. Your implementation, ArrayTreeNode or maybe just ArrayTree,
> would be the basic implementation, and my implemention would subclass
> it. If this turns out to be impractical, we can make an abstract class
with
> common code, which both versions would then subclass.

Makes sense in theory, but the main problem that I foresee is that it is not
possible to have a single class that implements both Map and
Collection/List. I don't know how that would affect your implementation.
Maybe it just means that your implementation won't implement Map, but will
still have the same methods.

> One question I have about your implementation: is there a reason you
didn't
> implement the java.util.Collection interface? It would make it more
compatible
> with other collections code.

This was one of the questions about my design. The children of a given tree
node can be held as a property of the node, or the node itself could be a
list of children. If we did extend Collection, what would add() mean? If it
means adding children, then TreeNode/Tree should probably extend List not
Collection.

> Once the current commons-collections release is out maybe we can get
> this added so we can hash it out.

Agreed. I think a collections Tree implementation is a good thing.

Stephen
FYI, the class was created to represent beans in a tree like structure so I
could iterate over them.



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Collections] TreeNode, TreeIterator, IdentityHashSet proposed collections

Posted by Kief Morris <ki...@bitbull.com>.
Stephen Colebourne typed the following on 10:27 PM 3/28/2002 +0000
>As part of the Joda project (www.joda.org) I have developed the following
>collections that I would like to propose for inclusion in Apache commons
>collection:
>
>1) TreeNode, ArrayTreeNode and TreeIterator
>http://www.scolebourne.eurobell.co.uk/JodaTreeProposal.zip
...
>I note that a Tree proposal from Kief Morris based on Map was submitted
>recently -
>http://marc.theaimsgroup.com/?l=jakarta-commons-dev&m=101561962307536&w=2. I
>would also be interested in any feedback as to how these two proposals might
>relate. (The designs are pretty different)

I've had a look over TreeNode, I think they're not so different that they
can't be tied together to share some of their interfaces and maybe some
code. The essential difference is that my version stores objects using
keys, like a map, while yours keeps them list-style. It makes sense to
me to have both, since yours is more fundamental, and my key functionality
is just an extension of the concept.

We ought to make a common interface for both versions, which would look
more like your TreeNode than my Tree, although I suggest we name it Tree.
Then my interface would extend this interface, and be called MapTree or 
KeyedTree. Your implementation, ArrayTreeNode or maybe just ArrayTree,
would be the basic implementation, and my implemention would subclass
it. If this turns out to be impractical, we can make an abstract class with 
common code, which both versions would then subclass.

This would also allow us to share common unit test code.

One question I have about your implementation: is there a reason you didn't 
implement the java.util.Collection interface? It would make it more compatible
with other collections code.

Once the current commons-collections release is out maybe we can get
this added so we can hash it out.

Kief


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>