Posted to dev@xalan.apache.org by Sc...@lotus.com on 2001/03/28 19:05:51 UTC

Preparing to be largely offline for a couple of weeks --> Re: Architectural Change Proposal: Direct DTM

Starting next week, I'm going to be working full throttle on the Direct DTM
changes.  I'm going to need to focus, and get this done as quickly as
possible, and with as few regressions as possible.  This means that bug
fixes that need my personal attention, new code review, etc., will have to
wait until this is over.  Blazingly fast performance and low garbage
collection overhead are my number one priorities, even above bug fixing.
Unless we quickly step up to the next performance level, Xalan will stop
being a viable processor option for many uses, in my opinion.

The main code won't be checked in on the main branch as I go, of course,
because it will be too destabilizing.  I'm considering a branch, but
haven't decided on this... I hate branches.  I think I would really rather
just code it, make sure it is stable and robust, and then check in on the
main branch.  I will check in non-dependent classes such as the DTM,
DTMIterator, DTM2DOMHelper, etc., so folks can review these if they want.

This means that if anyone has *high priority* requests from me, speak up
now, as I will be largely unresponsive starting next week, and lasting
hopefully no more than three weeks thereafter.

-scott


----- Forwarded by Scott Boag/CAM/Lotus on 03/28/01 11:49 AM -----
From:     Gary L Peskin <garyp@firstech.com>
Date:     03/16/01 04:57 AM
To:       xalan-dev@xml.apache.org
cc:       (bcc: Scott Boag/CAM/Lotus)
Subject:  Re: Architectural Change Proposal:  Direct DTM
Please respond to xalan-dev




Scott_Boag@lotus.com wrote:
>
> At the moment Xalan processes its source tree data via the DOM API, with
> some extensions.  The main problem with this is that a node has to be
> represented as an object with identity, which requires a certain amount of
> resources.  I believe we've come about to the limit with direct DOM
> processing.
>
> An alternative is an index-based API, i.e. something like:
> dtm.sourceTree.getData(nodeID), dtm.sourceTree.getNameID(nodeID),
> dtm.getNextSiblingID(nodeID), dtm.dispatchCharacterEvent(nodeID,
> contentHandler), etc.  Xalan would walk this API directly.
>
> The default implementation for this API would be based on XalanJ1's DTM,
> though there will be some fairly heavy modifications.  The reason that we
> did not bring the DTM into XalanJ2 is that, if you're requesting an
> interface, you need an implementing object.  So, though the DTM was much
> smaller than Stree, traversal was more expensive.  But, if you're just
> returning integer IDs, traversal can be just as fast or faster.  Also, in
> the original DTM, we used the Xerces String table, but this version will
> use a much more efficient approach.
>
> The problem with this is it makes it harder to consume a foreign DOM... a
> table would have to be constructed that mapped IDs to Nodes.  But, since
> this could be done incrementally, this might not really be too bad.  And it
> may actually make DOM processing faster, because we would end up with
> document order indexes.
>
> How much work would it be to adapt Xalan to this approach?  I think most of
> the work in Xalan would be fairly mechanical.  None of the interfaces would
> change, so this should be invisible to calling applications, except that
> things should become much faster and consume less memory.
>
> I'm pretty hot on this and would like to get it done soon... say over the
> next six weeks.
>
> Thoughts?
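
For readers following the thread: the quoted proposal mentions building a
table, incrementally, that maps IDs to Nodes of a foreign DOM.  Below is a
minimal sketch of what such a table might look like.  It is an assumption
for illustration, not Xalan code: the class name DomIdTableSketch, the NULL
constant, and the method names are all hypothetical.  Note that IDs handed
out this way follow the order in which nodes are first reached, so they
match document order only if the tree is first walked in document order.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical sketch: none of these names come from Xalan itself.
final class DomIdTableSketch {
    static final int NULL = -1;                     // marks "no such node"

    private final List<Node> idToNode = new ArrayList<>();
    private final Map<Node, Integer> nodeToId = new IdentityHashMap<>();

    /** Returns the ID for a node, assigning the next free ID on first sight. */
    int idFor(Node node) {
        Integer id = nodeToId.get(node);
        if (id == null) {
            id = idToNode.size();
            idToNode.add(node);
            nodeToId.put(node, id);
        }
        return id;
    }

    Node nodeFor(int nodeID) {
        return idToNode.get(nodeID);
    }

    /** Navigation takes and returns ints; the table grows as nodes are visited. */
    int getFirstChildID(int nodeID) {
        Node child = nodeFor(nodeID).getFirstChild();
        return child == null ? NULL : idFor(child);
    }

    int getNextSiblingID(int nodeID) {
        Node sibling = nodeFor(nodeID).getNextSibling();
        return sibling == null ? NULL : idFor(sibling);
    }

    int size() {
        return idToNode.size();
    }

    /** Builds a tiny <doc><a/><b/></doc> tree to stand in for a foreign DOM. */
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Node root = doc.appendChild(doc.createElement("doc"));
            root.appendChild(doc.createElement("a"));
            root.appendChild(doc.createElement("b"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DomIdTableSketch table = new DomIdTableSketch();
        int root = table.idFor(sampleDocument().getDocumentElement());
        for (int c = table.getFirstChildID(root); c != NULL;
                c = table.getNextSiblingID(c)) {
            System.out.println(c + " -> " + table.nodeFor(c).getNodeName());
        }
    }
}
```

The identity-keyed map is a design choice of this sketch: it assumes the
foreign DOM hands back the same Node object each time a node is reached.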

Scott --

This sounds like a +1 to me from a performance standpoint.  I'm kind of
sad that we're moving away from an OO model and into some old style
hacks to get around the performance issue.  It reminds me of the old
days where we had a "record identifier" in column 1 of our 80 column
cards!  I suppose we'll lose the extensibility benefits of the OO model
but these are exactly what are causing the performance problems.

Hopefully, the structure of the indexed item (array, vector or whatever)
will be well documented so that we can follow it.
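
To make the "indexed item" idea concrete, here is a minimal sketch of an
index-based tree kept in parallel int arrays, where a node is just an array
index.  It is only an illustration under assumptions: the class name
IndexedTreeSketch, the array layout, the NULL constant, and the addNode and
childNames helpers are hypothetical; only the accessor names
getNameID and getNextSiblingID echo the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the layout and names are illustrative, not the real DTM.
final class IndexedTreeSketch {
    static final int NULL = -1;                 // marks "no such node"

    // One slot per node in each parallel array; the node ID is the array index.
    private int[] firstChild = new int[4];
    private int[] lastChild = new int[4];
    private int[] nextSibling = new int[4];
    private int[] nameId = new int[4];
    private final List<String> names = new ArrayList<>();   // name ID -> name
    private int size = 0;

    /** Appends a node as the last child of parentId (or as a root if NULL). */
    int addNode(int parentId, String name) {
        if (size == firstChild.length) {
            grow();
        }
        int id = size++;
        firstChild[id] = NULL;
        lastChild[id] = NULL;
        nextSibling[id] = NULL;
        nameId[id] = internName(name);
        if (parentId != NULL) {
            if (firstChild[parentId] == NULL) {
                firstChild[parentId] = id;
            } else {
                nextSibling[lastChild[parentId]] = id;
            }
            lastChild[parentId] = id;
        }
        return id;
    }

    int getFirstChildID(int nodeID) { return firstChild[nodeID]; }
    int getNextSiblingID(int nodeID) { return nextSibling[nodeID]; }
    int getNameID(int nodeID) { return nameId[nodeID]; }
    String getName(int nodeID) { return names.get(nameId[nodeID]); }

    private int internName(String name) {
        int i = names.indexOf(name);            // linear scan keeps the sketch short
        if (i < 0) {
            names.add(name);
            i = names.size() - 1;
        }
        return i;
    }

    private void grow() {
        int n = firstChild.length * 2;
        firstChild = Arrays.copyOf(firstChild, n);
        lastChild = Arrays.copyOf(lastChild, n);
        nextSibling = Arrays.copyOf(nextSibling, n);
        nameId = Arrays.copyOf(nameId, n);
    }

    /** Walks the children of nodeID using only int IDs, allocating no node objects. */
    String childNames(int nodeID) {
        StringBuilder sb = new StringBuilder();
        for (int c = getFirstChildID(nodeID); c != NULL; c = getNextSiblingID(c)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(getName(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IndexedTreeSketch t = new IndexedTreeSketch();
        int doc = t.addNode(NULL, "doc");
        t.addNode(doc, "a");
        t.addNode(doc, "b");
        t.addNode(doc, "a");
        System.out.println(t.childNames(doc));
    }
}
```

In this sketch the two "a" children share one name ID, so comparing names
during traversal is an int comparison rather than a string comparison.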

Gary