Posted to j-users@xerces.apache.org by Tobias McNulty <tm...@datadesk.com> on 2001/12/12 16:03:56 UTC

Fast Nodes

I am trying to implement a custom Element class to speed access to 
its children.  For example, I have a huge array of long ints, and it 
would be very inefficient to store these all in their own Element 
class as a true child.

So, I want to customize Element and provide functions that access the 
array directly.  One possibility would be implementing the 
NodeIterator interface to make nextNode() return the same node each 
time, but increment an internal iterator in the parent Element so 
that the next time the value of the Element is accessed, it returns 
the next entry in the array.
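
The iterator idea described above can be sketched using only the org.w3c.dom interfaces that ship with the JDK. This is a minimal sketch under assumptions, not Xerces code: the class names ArrayBackedIterator and IteratorDemo, the child element name, and the choice of setTextContent() to expose each value are all hypothetical and chosen for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;

// Hypothetical class: walks a long[] but always hands back the same Element.
class ArrayBackedIterator implements NodeIterator {
    private final Element holder;  // parent element that "owns" the array
    private final Element cursor;  // the single node returned on every call
    private final long[] values;
    private int next = 0;          // index of the entry nextNode() will expose

    ArrayBackedIterator(Element holder, String childName, long[] values) {
        this.holder = holder;
        this.values = values;
        // Created once and never attached to the tree.
        this.cursor = holder.getOwnerDocument().createElement(childName);
    }

    @Override public Node nextNode() {
        if (next >= values.length) return null;  // DOM convention: null at the end
        cursor.setTextContent(Long.toString(values[next++]));
        return cursor;
    }

    @Override public Node previousNode() {
        if (next <= 0) return null;
        cursor.setTextContent(Long.toString(values[--next]));
        return cursor;
    }

    @Override public Node getRoot() { return holder; }
    @Override public int getWhatToShow() { return NodeFilter.SHOW_ELEMENT; }
    @Override public NodeFilter getFilter() { return null; }
    @Override public boolean getExpandEntityReferences() { return false; }
    @Override public void detach() { /* nothing to release in this sketch */ }
}

// Hypothetical helper: creates an empty parent element with the JDK's default parser.
class IteratorDemo {
    static Element newHolder(String name) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument().createElement(name);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the reused element is never appended to its parent, so only callers that go through the iterator see the values; code that walks getFirstChild()/getNextSibling() on the parent finds no children.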

I have heard bits and pieces about "fast nodes" and "Deferred" classes.
Would these help in this situation?  What exactly are they, what do they
do, and where can I find more information about them?  The only
documentation I found was very sparse.

Thanks in advance,

Toby
-- 
Tobias McNulty
Data Description, Inc.
840 Hanshaw Road, Suite 9
Ithaca, NY 14850
Phone: (607) 257-1000
E-mail: tmcnulty@datadesk.com
Web: www.datadesk.com

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


RE: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
>Alternatively, check out the tiny, elegant, fast NanoXml at
>http://nanoxml.sourceforge.net/
>No problems extending that.
>

Unfortunately I need something that conforms to the w3c DOM spec that 
I can use with Xalan or some other XSL/XPath transformer.


Setting document-class-name

Posted by Tobias McNulty <tm...@datadesk.com>.
>And then just set the document-class-name property on the parser
>so that it uses your document class for making the DOM nodes as
>it parses. But depending on what you want to do, you may have to
>be careful about what you override, etc.

Is there anything special I have to do to help xerces find the class 
I am trying to set as document-class-name?  I set the property 
successfully, but it isn't finding my class (which is, of course, 
part of my applet's jar).  I also made sure to use the fully 
qualified name, e.g. DataDesk.DDDocumentImpl;
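
One way to narrow this down is to check which class loader can actually see the class. The sketch below is a diagnostic only; it assumes (the thread does not confirm this) that the parser and the applet classes may be loaded by different class loaders. ClassLookup and isVisible are hypothetical names.

```java
// Hypothetical diagnostic: reports whether a class name resolves through a given loader.
class ClassLookup {
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

Calling ClassLookup.isVisible("DataDesk.DDDocumentImpl", loader) once with the applet's own loader and once with the loader of the parser class would show whether the two disagree.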

Thanks,

Toby


Re: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
>And then just set the document-class-name property on the parser
>so that it uses your document class for making the DOM nodes as
>it parses. But depending on what you want to do, you may have to
>be careful about what you override, etc.
>

DocumentBuilder.newDocument() doesn't seem to use my custom Document 
class (set using document-class-name), while DocumentBuilder.parse() 
does.  Is it supposed to work this way?

Thanks,

Toby


Re: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
>I am not clear on where this property is set.  I searched the API a 
>bit and found the DocumentBuilderFactory.setProperty method.  Is 
>this what I'm looking for?  If so, where can I get a list of the 
>valid property names?  I will be interested in overriding the 
>DocumentImpl and ElementImpl classes (probably others as well).

Correction -- that should read setAttribute, not setProperty (though 
you probably knew what I meant anyway).



Re: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
From http://xml.apache.org/xerces2-j/properties.html:

" The DocumentBuilderFactory interface contains a 
setAttribute(String,Object) method which may provide a means to set 
features and properties on the underlying parser. However, it cannot 
be relied upon. Therefore, you must use the Xerces DOMParser object 
directly."

Will the setAttribute function always work if Xerces _is_ the 
underlying parser?  It is quite understandable that it wouldn't work 
if Xerces wasn't the underlying parser, but if it is, is there any 
situation under which setAttribute might not work?
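
A defensive way to use setAttribute is to treat rejection as a normal outcome. The sketch below relies only on the JAXP contract that setAttribute throws IllegalArgumentException for attributes the implementation does not recognize. The two identifiers are the Xerces property and feature ids as given in the Xerces documentation; FactoryAttributes and trySetAttribute are hypothetical names. It does not answer whether Xerces always honors the attribute.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper for passing implementation-specific settings through JAXP.
class FactoryAttributes {
    // Xerces property id: class name of the Document implementation to build.
    static final String DOCUMENT_CLASS_NAME =
            "http://apache.org/xml/properties/dom/document-class-name";
    // Xerces feature id: whether DOM nodes are expanded lazily.
    static final String DEFER_NODE_EXPANSION =
            "http://apache.org/xml/features/dom/defer-node-expansion";

    // Returns true if the factory accepted the attribute, false if it rejected it.
    static boolean trySetAttribute(DocumentBuilderFactory factory, String name, Object value) {
        try {
            factory.setAttribute(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            // The underlying implementation does not recognize this attribute.
            return false;
        }
    }
}
```

A caller could try DEFER_NODE_EXPANSION with Boolean.FALSE and DOCUMENT_CLASS_NAME with the fully qualified class name, and fall back to using the Xerces DOMParser directly when either call returns false.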

Thanks,

Toby




Re: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
>And then just set the document-class-name property on the parser
>so that it uses your document class for making the DOM nodes as
>it parses. But depending on what you want to do, you may have to
>be careful about what you override, etc.
>

Hi Andy,

Wow, I was not aware that something like this was possible.  This is 
just what I am looking for -- and, if Xerces supports my 
understanding of what you're saying, then in no way is its complexity 
a bad thing.

I am not clear on where this property is set.  I searched the API a 
bit and found the DocumentBuilderFactory.setProperty method.  Is this 
what I'm looking for?  If so, where can I get a list of the valid 
property names?  I will be interested in overriding the DocumentImpl 
and ElementImpl classes (probably others as well).

Much thanks,

Toby


RE: Fast Nodes

Posted by Craig Collings <cr...@syntax.co.nz>.
This strategy will fail at runtime when you parse a document and attempt to
cast an element of that document to MyElement before that element has been
fully expanded. It will throw a ClassCastException.

The reason: until it is fully expanded, the element is represented not by an
instance of ElementImpl but by an instance of DeferredElementImpl.
DeferredElementImpl is not a subclass of ElementImpl, and ElementImpl is not
a subclass of DeferredElementImpl. They are in separate hierarchies.
Moreover, there is no public constructor for DeferredElementImpl or its
superclasses. Ditto for the *NS hierarchies.

On the other hand, if you construct a DOM from scratch, adding instances of
your subclasses to the model, that will work and serialize just fine, i.e.
you can write but not read effectively.

Like you I wish this were not so. I will be following this thread with
interest.

-----Original Message-----
From: Andy Clark [mailto:andyc@apache.org]
Sent: Thursday, 13 December 2001 11:09 p.m.
To: xerces-j-user@xml.apache.org
Subject: Re: Fast Nodes


Tobias McNulty wrote:
> So basically there isn't anything I can do to extend Xerces other
> than going into its source and creating a custom
> createNodeIterator function?

No, there's nothing stopping you from extending the Apache DOM
implementation. And you don't have to deal with all of the
Deferred* node types. For example:

  public class MyElement extends ElementImpl {
    public MyElement(DocumentImpl doc, String name) {
      super(doc, name);
    }
  }

  public class MyDocument extends DocumentImpl {
    public Element createElement(String name) {
      return new MyElement(this, name);
    }
  }

[NOTE: I didn't check this for errors.]

And then just set the document-class-name property on the parser
so that it uses your document class for making the DOM nodes as
it parses. But depending on what you want to do, you may have to
be careful about what you override, etc.

> >help to try subclassing the "deferred" hierarchy. The truth is that xerces
> >implementations are so tightly dependent on the xerces framework that
> >extension is hardly worth the effort.

That's hardly accurate. But there is a little bit of complexity
introduced for the benefit of memory and runtime performance.
So either we are too simple and slow which people don't like;
or we're fast but too complex which people don't like. Oh well...

--
Andy Clark * andyc@apache.org



Re: Fast Nodes

Posted by Andy Clark <an...@apache.org>.
Tobias McNulty wrote:
> So basically there isn't anything I can do to extend Xerces other
> than going into its source and creating a custom
> createNodeIterator function?

No, there's nothing stopping you from extending the Apache DOM
implementation. And you don't have to deal with all of the
Deferred* node types. For example:

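  // MyElement.java
  import org.apache.xerces.dom.DocumentImpl;
  import org.apache.xerces.dom.ElementImpl;

  public class MyElement extends ElementImpl {
    public MyElement(DocumentImpl doc, String name) {
      super(doc, name);
    }
  }

  // MyDocument.java
  import org.apache.xerces.dom.DocumentImpl;
  import org.w3c.dom.Element;

  public class MyDocument extends DocumentImpl {
    // Factory method: hand out MyElement instead of the stock ElementImpl.
    @Override
    public Element createElement(String name) {
      return new MyElement(this, name);
    }
  }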

[NOTE: I didn't check this for errors.]

And then just set the document-class-name property on the parser 
so that it uses your document class for making the DOM nodes as 
it parses. But depending on what you want to do, you may have to 
be careful about what you override, etc.

> >help to try subclassing the "deferred" hierarchy. The truth is that xerces
> >implementations are so tightly dependent on the xerces framework that
> >extension is hardly worth the effort.

That's hardly accurate. But there is a little bit of complexity
introduced for the benefit of memory and runtime performance.
So either we are too simple and slow which people don't like;
or we're fast but too complex which people don't like. Oh well...

-- 
Andy Clark * andyc@apache.org



RE: Fast Nodes

Posted by Tobias McNulty <tm...@datadesk.com>.
So basically there isn't anything I can do to extend Xerces other 
than going into its source and creating a custom 
createNodeIterator function?

>Yes, I've tried this too. It would be very useful to be able to type elements
>correctly in a Java environment. It makes the handling easier and safer.
>Unfortunately, the xerces implementations use a complex inheritance
>structure for their elements. The "deferred Node" implementations allow DOM
>nodes not to be fully expanded until they are required. The downside is that
>casts between xerces elements and whatever subclasses of ElementImpl you
>write will fail if that element has not been fully expanded. Nor will it
>help to try subclassing the "deferred" hierarchy. The truth is that xerces
>implementations are so tightly dependent on the xerces framework that
>extension is hardly worth the effort.
>
>Although I haven't really solved this problem to my satisfaction yet, it
>might involve the containment of an org.w3c.dom type in a proxy of some
>sort. That perhaps implies one proxy class for every element type you are
>concerned with, and a graph of types matching a particular schema.
>
>Alternatively, check out the tiny, elegant, fast NanoXml at
>http://nanoxml.sourceforge.net/
>No problems extending that.



RE: Fast Nodes

Posted by Craig Collings <cr...@cabusiness.co.nz>.
Yes, I've tried this too. It would be very useful to be able to type elements
correctly in a Java environment. It makes the handling easier and safer.
Unfortunately, the xerces implementations use a complex inheritance
structure for their elements. The "deferred Node" implementations allow DOM
nodes not to be fully expanded until they are required. The downside is that
casts between xerces elements and whatever subclasses of ElementImpl you
write will fail if that element has not been fully expanded. Nor will it
help to try subclassing the "deferred" hierarchy. The truth is that xerces
implementations are so tightly dependent on the xerces framework that
extension is hardly worth the effort.

Although I haven't really solved this problem to my satisfaction yet, it
might involve the containment of an org.w3c.dom type in a proxy of some
sort. That perhaps implies one proxy class for every element type you are
concerned with, and a graph of types matching a particular schema.
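
The containment idea can be sketched as a small wrapper that holds a plain org.w3c.dom.Element instead of subclassing any parser class. This is only an illustration under assumptions: LongArrayElement and ProxyDemo are hypothetical names, and storing the numbers as whitespace-separated text inside the element is an assumed format, not something the thread specifies.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical typed proxy: contains a DOM element rather than extending ElementImpl.
class LongArrayElement {
    private final Element element;

    LongArrayElement(Element element) {
        this.element = element;
    }

    // Parses the element's text content into a long[]; empty text gives an empty array.
    long[] values() {
        String text = element.getTextContent().trim();
        if (text.isEmpty()) return new long[0];
        String[] tokens = text.split("\\s+");
        long[] out = new long[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = Long.parseLong(tokens[i]);
        }
        return out;
    }

    // Gives back the underlying node for code that needs the standard DOM API.
    Element unwrap() {
        return element;
    }
}

// Hypothetical helper: builds a standalone element with the JDK's default parser.
class ProxyDemo {
    static Element newElement(String name, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element e = doc.createElement(name);
            e.setTextContent(text);
            return e;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Because the wrapper never casts to an implementation class, it works the same whether or not the wrapped node has been expanded; the cost is one wrapper class per element type, as noted above.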

Alternatively, check out the tiny, elegant, fast NanoXml at
http://nanoxml.sourceforge.net/
No problems extending that.

-----Original Message-----
From: Tobias McNulty [mailto:tmcnulty@datadesk.com]
Sent: Thursday, 13 December 2001 4:04 a.m.
To: xerces-j-user@xml.apache.org
Subject: Fast Nodes


I am trying to implement a custom Element class to speed access to
its children.  For example, I have a huge array of long ints, and it
would be very inefficient to store these all in their own Element
class as a true child.

So, I want to customize Element and provide functions that access the
array directly.  One possibility would be implementing the
NodeIterator interface to make nextNode() return the same node each
time, but increment an internal iterator in the parent Element so
that the next time the value of the Element is accessed, it returns
the next entry in the array.

I heard bits and pieces about "fast nodes" and "Deferred" classes.
Would these help under this situation?  What exactly are they/do they
do, and where can I find more information about them?  The only
documentation I found was very sparse.

Thanks in advance,

Toby