Posted to general@jakarta.apache.org by Luis Arias <lu...@elysia.com> on 1999/11/16 19:12:20 UTC

Use of DOM in Ant

Hi,

I've been reading the code in the jakarta-tools workspace to try and get
familiar with the way you work...

I noticed some recent factoring in the Ant classes tending to put related
functionality in various Helpers (ScriptHelper, ProjectHelper, XmlHelper,
etc...).

I was just wondering if you had considered a slightly different architecture
in which instead of converting the build file from xml into basically a
vector of Tasks, you might have :

1. Used an ElementFactory subclass to let Sun's parser create appropriate
concrete task classes on the fly based on some mapping.

2. Used a ProjectBuilder class to iterate around relevant nodes in the build
file's DOM, running necessary tasks as it goes along.

The advantage that I see is that you basically use a single data structure
instead of having to map the build file to some other structure, like a
Vector of Tasks.  This way, if the build file's schema becomes more complex,
we have an architecture that can handle this.  For instance, being able to
define parallel processing or something like that...

In that case the Task abstract class would have to implement the
necessary DOM interfaces or be a subclass of ElementNode, which of course
exposes a much larger API to potential clients.  Instead of having to define
instance variables for Task attributes, the values could stay where they are
stored in DOM objects but still be referred to via the existing get / set
methods.
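
A minimal Java sketch of the proposal above, using the standard org.w3c.dom
interfaces rather than Sun's parser-specific ElementFactory (so the
"create task classes on the fly" step is replaced by a plain wrapper).
DomBackedTask, ProjectBuilder.run and the "dir" attribute are hypothetical
names chosen for illustration; they are not taken from the Ant sources, and
the flat project/task layout is a simplification.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical task whose attribute values stay in the DOM element. */
class DomBackedTask {
    private final Element element;

    DomBackedTask(Element element) {
        this.element = element;
    }

    String getName() {
        return element.getTagName();
    }

    // Getter and setter delegate to the DOM instead of instance variables.
    String get(String attribute) {
        return element.getAttribute(attribute);
    }

    void set(String attribute, String value) {
        element.setAttribute(attribute, value);
    }

    // Stand-in for real work: report what would be executed.
    String execute() {
        return getName() + " " + get("dir");
    }
}

/** Hypothetical builder that walks the build file's DOM and runs tasks in order. */
class ProjectBuilder {
    static List<String> run(String buildXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(buildXml)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse build file", e);
        }
        List<String> log = new ArrayList<>();
        // Visit each child element of the root and treat it as a task.
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                log.add(new DomBackedTask((Element) n).execute());
            }
        }
        return log;
    }
}
```

The DOM stays the single data structure here: the task objects hold no
state of their own, which is the property the proposal is after, at the
cost of every task depending on org.w3c.dom.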

This seems to be a more natural approach to me.  What are your thoughts
here ?

--
Luis Arias
Elysia, Sarl - http://www.elysia.com
13, avenue Morane Saulnier
78140 VELIZY
FRANCE
+33 1 30 70 63 42
+33 1 34 65 36 79 fax
+33 6 14 20 87 93 mobile

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

----- Original Message -----
From: James Duncan Davidson <ja...@eng.sun.com>
To: <ge...@jakarta.apache.org>
Sent: Wednesday, November 17, 1999 6:43 AM
Subject: Re: Use of DOM in Ant


>
> > I was just wondering if you had considered a slightly different architecture
> > in which instead of converting the build file from xml into basically a
> > vector of Tasks, you might have :
> >
> > 1. Used an ElementFactory subclass to let Sun's parser create appropriate
> > concrete task classes on the fly based on some mapping.
> >
> > 2. Used a ProjectBuilder class to iterate around relevant nodes in the build
> > file's DOM, running necessary tasks as it goes along.
>
> No. First off, I don't want to tie Ant to any particular parser. As
> well, the representation of the project is intended to be something that
> can be kept live in memory and run many times.
>
> I feel that it is a mistake to closely tie the object semantics of an
> object graph directly to any technology. XML is a great way to express
> these things as an editable data format, but I don't want it coloring
> how I program objects once they are instantiated and live. To me forcing
> the object model to know too much about its XML storage format is
> unnatural.
>
> .duncan
>
>

I understand these concerns, but please take a look at my answer to Costin...
I believe the benefits may outweigh the inconvenience in terms of
extensibility and evolution.  I don't have much to say about parser issues
except that we have the source code for Sun's parser...

Thanks !

Re: Use of DOM in Ant

Posted by James Duncan Davidson <ja...@eng.sun.com>.

> I was just wondering if you had considered a slightly different architecture
> in which instead of converting the build file from xml into basically a
> vector of Tasks, you might have :
> 
> 1. Used an ElementFactory subclass to let Sun's parser create appropiate
> concrete task classes on the fly based on some mapping.
> 
> 2. Used a ProjectBuilder class to iterate around relevant nodes in the build
> file's DOM, running necessary tasks as it goes along.

No. First off, I don't want to tie Ant to any particular parser. As
well, the representation of the project is intended to be something that
can be kept live in memory and run many times.

I feel that it is a mistake to closely tie the object semantics of an
object graph directly to any technology. XML is a great way to express
these things as an editable data format, but I don't want it coloring
how I program objects once they are instantiated and live. To me forcing
the object model to know too much about its XML storage format is
unnatural.

.duncan
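
A minimal Java sketch of the position above: a project representation
that knows nothing about XML or any parser and can be kept in memory and
run repeatedly. BuildTask and InMemoryProject are hypothetical names for
illustration, not Ant's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical task interface; nothing here refers to XML or a parser. */
interface BuildTask {
    void execute(List<String> log);
}

/** Hypothetical in-memory project that can be executed many times. */
class InMemoryProject {
    private final List<BuildTask> tasks = new ArrayList<>();

    void addTask(BuildTask task) {
        tasks.add(task);
    }

    // Each run walks the same live objects; no re-parsing is needed.
    List<String> run() {
        List<String> log = new ArrayList<>();
        for (BuildTask task : tasks) {
            task.execute(log);
        }
        return log;
    }
}
```

Whatever reads the build file (DOM, SAX, or something else) only has to
call addTask; swapping the parser does not touch these classes.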

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

>
> DOM was basically evolved from the Dynamic HTML support in v4 web
> browsers, and isn't "generic" for more than SGML-ish documents.
> It very much _was_ designed to be exposed as an actual representation;
> that's how DHTML works.  You can trace many of the problems folk have
> with it to that DHTML (browser) heritage.
>
> On the other hand, there's no credible alternative data structure API
> for XML data.  There are developers that prefer rolling their own such
> APIs, but even more of them would rather avoid that.  ECS doesn't seem
> to solve the same problem, from what I've seen.  And I'll confess that
> I find the notion of each application having its own hack to supplant
> DOM, rather than starting with DOM, less attractive than dealing with
> certain of the current set of warts.
>

Exactly !  Well, at least I have *some* support for my ideas ... :-)
I also recently discovered things like the new DOM traversal APIs, which
lead me to believe that the correct approach to the problems I mentioned in
my tentative summary may be found in the idea of projecting level-specific
views of the DOM using traversal APIs or other upcoming techniques, instead
of dealing with impedance-mismatch issues using custom application-specific
models...  (Even though I think the introspection code in ant is a
nice hack !)
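
A minimal Java sketch of a filtered "view" over a DOM, using the DOM
Level 2 Traversal interfaces in org.w3c.dom.traversal. It assumes the
parser's Document implements DocumentTraversal (the sketch checks this at
run time); TargetView and the "target"/"name" vocabulary are hypothetical
choices for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

class TargetView {
    /** Returns the name attribute of every target element, in document order. */
    static List<String> targetNames(String buildXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(buildXml)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse build file", e);
        }
        if (!(doc instanceof DocumentTraversal)) {
            throw new UnsupportedOperationException("this DOM does not support traversal");
        }
        // The filter defines the view: only <target> elements are visible.
        NodeFilter onlyTargets = node -> "target".equals(node.getNodeName())
                ? NodeFilter.FILTER_ACCEPT
                : NodeFilter.FILTER_SKIP;
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, onlyTargets, true);
        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(((Element) n).getAttribute("name"));
        }
        return names;
    }
}
```

The underlying tree is untouched; different filters would give different
levels of the application their own view of the same document.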


>
> Luis Arias asked what APIs XSLT processors use.  I've seen three so
> far:  files as input, SAX events streams as input, and a DOM tree as
> input.  James Clark's implementation works with Sun's DOM -- which I
> designed with reference to then-current XSLT drafts, including many
> features that XSLT needs but DOM L1 doesn't expose -- but also has a
> caveat that DOM input is not as efficient as the other forms.  It's
> an exercise for the reader to find out why that is.  Oracle and
> Microsoft use DOM in their XSLT API, as I recall.  And clearly XSLT
> is designed to work with "documents".
>

Thanks for sharing your insight here...

>
> > XML applications should work with their own specialized internal
> > object models, appropriate for the application's needs.  If they need
> > to provide a DOM-based extensibility interface (something that seems
> > unlikely for ant) that should be done by other objects that implement
> > the DOM interfaces and map the internal structure to DOM.
>
> The reason this seems to me to be unduly harsh is that it came
> completely without justification.  There are certainly cases I'd
> not recommend DOM, but then I can provide reasons for those cases.
> Likewise, there are certainly cases that I _would_ recommend DOM.
>
>

Sure, I also believe in staying very nimble here and choosing the right tool
for the right job...  It's just that there is so much intellectual and
pragmatic energy going into these XML-based paradigms that I really don't
see why I would not use them now, except where there are flagrant
non-functional requirements (size, speed, environment, etc.).

Re: Use of DOM in Ant

Posted by Glenn Vanderburg <gv...@delphis.com>.

David Brownell <da...@pacbell.net> writes:
> 
> However I'd not encourage anyone to view that as "the general" case
> for XML data.  You really need to ask if a DOM model is right for
> your particular task.  And when you answer, you need to be wary of
> any biases you may have -- both "for" and "against" DOM!

Sounds like we've actually been in violent agreement.  :-)

-- 
Glenn Vanderburg
Delphi Consultants, LLC
glv@delphis.com

Re: Use of DOM in Ant

Posted by Stefano Mazzocchi <st...@apache.org>.

Luis Arias wrote:
> 
> > However I'd not encourage anyone to view that as "the general" case
> > for XML data.  You really need to ask if a DOM model is right for
> > your particular task.  And when you answer, you need to be wary of
> > any biases you may have -- both "for" and "against" DOM!
> >
> 
> Sure, this sounds level-headed !  And I perfectly agree with that statement.
> Thanks again to everyone for their input on this.  It would be interesting
> to explore these thoughts further but I don't want to bother people on the
> jakarta list any further with this topic...
> 
> Just one last thing (I hope I'm not starting to sound like Columbo here ...
> :-) Does anyone have any pointers to appropriate lists for these
> architectural issues (preferably within open-source projects so we can
> get some of these ideas translated into code) ?

What about "general@xml.apache.org"? :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

> However I'd not encourage anyone to view that as "the general" case
> for XML data.  You really need to ask if a DOM model is right for
> your particular task.  And when you answer, you need to be wary of
> any biases you may have -- both "for" and "against" DOM!
>

Sure, this sounds level-headed !  And I perfectly agree with that statement.
Thanks again to everyone for their input on this.  It would be interesting
to explore these thoughts further but I don't want to bother people on the
jakarta list any further with this topic...

Just one last thing (I hope I'm not starting to sound like Columbo here ...
:-) Does anyone have any pointers to appropriate lists for these
architectural issues (preferably within open-source projects so we can
get some of these ideas translated into code) ?

Cheers !

Re: Use of DOM in Ant

Posted by James Duncan Davidson <ja...@eng.sun.com>.

> p.s. Re Duncan's comment about sometimes feeling "back in C++" when
>         programming DOM ... blame much of that on JavaScript, since
>         IDL can (and does) do a lot better.  Lots of DOM design was
>         done to preserve familiarity for JavaScript folk, over the
>         objections of People Who Know Better.

Ah... Good bit of history to know. Thanks!

-- 
James Davidson                                     duncan@eng.sun.com 
Java + XML / Portable Code + Portable Data                 !try; do()

Re: Use of DOM in Ant

Posted by David Brownell <da...@pacbell.net>.

Luis Arias wrote:
> 
> I came across the SMIL DOM recently, and although it does involve an XML
> "document" type of representation, it is an example of extending the DOM by
> subclassing the interfaces to go to an application-specific object model
> which *preserves* the rich structure present in the XML.  This is probably
> the way to go in the general case.  Use layered extensions of the DOM to
> bridge the gap between your application-level concepts and your XML
> representation. 

That's certainly the model implied by the DOM, in quite a few ways.
Sun's DOM, and many/most others, supports it.  (The HTML DOM does
the same sort of stuff:  extending the core of DOM with methods that
are specific to the elements in question.)

However I'd not encourage anyone to view that as "the general" case
for XML data.  You really need to ask if a DOM model is right for
your particular task.  And when you answer, you need to be wary of
any biases you may have -- both "for" and "against" DOM!

- Dave

p.s. Re Duncan's comment about sometimes feeling "back in C++" when
	programming DOM ... blame much of that on JavaScript, since
	IDL can (and does) do a lot better.  Lots of DOM design was
	done to preserve familiarity for JavaScript folk, over the
	objections of People Who Know Better.

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

>
> > In this case, as Costin pointed out, the fundamental entities are not
> > document elements; they just happen to be stored and input that way.
> > I feel that it would be a hack to force the application to treat them
> > as document elements throughout.
>
> That thread of argument wasn't in the post to which I replied; I
> could concur to some degree.  There clearly IS a level at which
> the components must strictly conform to the XML data model; the
> question is thus when (and whether) to convert to another data
> structure, perhaps more application-appropriate.
>

I think we would all agree there is some distance between the core DOM and
an application-level object model.  So how to go from this fairly low-level
document-type representation to the higher-level model ?  The solution
implemented in ant, for the reasons which we have seen in the thread, was to
map to an application-level model without preserving any referential
integrity with respect to the XML representation.  This means that it might
be difficult to update the build file, let's say to parse dependencies, but
it sounds like it was a good decision for ant based on its stated focus.
However, in the general case, this might not be a good decision.  It all
depends upon your requirements and your educated guess as to how they will
evolve in the future.

I came across the SMIL DOM recently, and although it does involve an XML
"document" type of representation, it is an example of extending the DOM by
subclassing the interfaces to go to an application-specific object model
which *preserves* the rich structure present in the XML.  This is probably
the way to go in the general case.  Use layered extensions of the DOM to
bridge the gap between your application-level concepts and your XML
representation.  I believe the SMIL DOM is a concrete example of this
approach which merits study (emulation ?).

On another topic, I don't see many visual representations (as in a UML
model) of the codebase.  Would you be interested in me posting a couple of
GIFs of this ?  I might be able to generate some neat static class diagrams
with my reverse engineering tool (ObjectDomain).   I could try this out on
ant for starters.

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

> > [ All of this now being fully irrelevant to ANT now, I'd say ... ]
>
> Pretty much. But since there's philosophical points that affect our use
> of XML, I'll argue in. ;)
>

Ah ! I'm pleased with this comment because I thought that there would be a
lot of discussion on design in these apache lists.  I'm fascinated by how
you all have put this stuff together !  Let's model it and then build it !
(Or build it and then model it, but let's model it !)

> > DOM was designed to be exposed as the in-memory representation
> > of the XML version of data -- how's that?
>
> ...in the context of a web browser. Remember that the original DOM was
> put together by people from the browser manufacturers trying to come up
> with a model by which JavaScript could play with the document in the
> scope of a browser. Yes, a lot of energy has been expended on
> generalization, but at some level, this origin point colors DOM to a
> great degree.
>
> So, when we talk about XML as documents, DOM makes sense to me. When we
> use XML as a universal data format, it doesn't mean as much.
>

Documents are just a specialized view on data.  Beyond the relationships
between the different data elements (as in an ER model) you have the visual
formatting aspects, etc..  But I thought the separation of concerns here had
been taken care of as far as XML is concerned in that you have XML to
represent your data and XSL to encapsulate all the visual aspects.  If this
is true, why not use XML as a universal data format ?


> > the components must strictly conform to the XML data model; the
> > question is thus when (and whether) to convert to another data
> > structure, perhaps more application-appropriate.
>
> Right -- and this is the core of what makes Ant pretty cool for doing
> what it does. The <taskdefname attrib1="value" attrib2="value"/>
> causing a class to be introspected is where things get interesting. And
> application specific.
>

Yes, that is pretty cool..  :-)

Re: Use of DOM in Ant

Posted by James Duncan Davidson <ja...@eng.sun.com>.

> [ All of this now being fully irrelevant to ANT now, I'd say ... ]

Pretty much. But since there's philosophical points that affect our use
of XML, I'll argue in. ;)

> DOM was designed to be exposed as the in-memory representation
> of the XML version of data -- how's that?

...in the context of a web browser. Remember that the original DOM was
put together by people from the browser manufacturers trying to come up
with a model by which JavaScript could play with the document in the
scope of a browser. Yes, a lot of energy has been expended on
generalization, but at some level, this origin point colors DOM to a
great degree.

So, when we talk about XML as documents, DOM makes sense to me. When we
use XML as a universal data format, it doesn't mean as much.

> the components must strictly conform to the XML data model; the
> question is thus when (and whether) to convert to another data
> structure, perhaps more application-appropriate.

Right -- and this is the core of what makes Ant pretty cool for doing
what it does. The <taskdefname attrib1="value" attrib2="value"/>
causing a class to be introspected is where things get interesting. And
application specific.
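
A minimal Java sketch of the kind of introspection described above: each
XML attribute name is mapped to a setXxx(String) method found by
reflection. CopyTaskBean and AttributeInjector are hypothetical names, and
the setter naming rule is an assumption for illustration; Ant's actual
introspection code is not shown in this thread.

```java
import java.lang.reflect.Method;
import java.util.Map;

/** Hypothetical task bean with one setter per supported attribute. */
class CopyTaskBean {
    private String src;
    private String dest;

    public void setSrc(String src) {
        this.src = src;
    }

    public void setDest(String dest) {
        this.dest = dest;
    }

    public String describe() {
        return src + " -> " + dest;
    }
}

class AttributeInjector {
    /** Calls setXxx(String) on the bean for every attribute named xxx. */
    static void configure(Object bean, Map<String, String> attributes) {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            String name = entry.getKey();
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            try {
                // Look the setter up by name at run time; no per-task parsing code.
                Method method = bean.getClass().getMethod(setter, String.class);
                method.invoke(bean, entry.getValue());
            } catch (ReflectiveOperationException ex) {
                throw new IllegalArgumentException("no usable setter for attribute " + name, ex);
            }
        }
    }
}
```

With this arrangement a new task only needs public setters; the code that
reads the build file never has to learn about its attributes.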

> > > I'd be curious to know what is meant by "really ugly" there.  I have
> > > my own criticisms of DOM, but I usually try to ignore issues of personal
> > > programming style.
> >
> > So do I.  One thing I remember is lots of translating back and forth
> > between NodeLists and Vectors, and NamedNodeMaps and Hashtables.
> 
> When I write code _using_ the DOM, such issues have never once
> come up.  Never.  When implementing it ... maybe.

The IDL generation of DOM really shows through. Every time I use it, I
feel like I'm halfway back in C++ land. 

> That gets to the question I mentioned earlier:  when does one start
> to use an application-appropriate data model.  I've certainly seen
> cases where such conversions benefit from having a DOM version to
> consult, rather than needing to deal with SAX.  And cases where such
> conversions are completely inappropriate.

Right. In Ant we should go straight from SAX to internal data model. The
hack that's there reads in as DOM and translates -- pretty high on the
friction scale. But I think I've alluded that coding at 35000' in a tiny
little seat makes for interesting shortcuts. :)
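
A minimal Java sketch of going straight from SAX events to an internal
model with no intermediate DOM tree. It uses the standard JAXP
SAXParserFactory and org.xml.sax.helpers.DefaultHandler; TaskSpec,
SaxProjectReader and the rule that direct children of the root are tasks
are hypothetical simplifications, not Ant's implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical plain task record: an element name plus its attributes. */
class TaskSpec {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();

    TaskSpec(String name) {
        this.name = name;
    }
}

class SaxProjectReader extends DefaultHandler {
    private final List<TaskSpec> tasks = new ArrayList<>();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        if (depth == 2) { // direct children of the root are treated as tasks
            TaskSpec spec = new TaskSpec(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                spec.attributes.put(atts.getQName(i), atts.getValue(i));
            }
            tasks.add(spec);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    /** Parses the build file and returns the model built from SAX events. */
    static List<TaskSpec> read(String buildXml) {
        SaxProjectReader handler = new SaxProjectReader();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(buildXml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse build file", e);
        }
        return handler.tasks;
    }
}
```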


-- 
James Davidson                                     duncan@eng.sun.com 
Java + XML / Portable Code + Portable Data                 !try; do()

Re: Use of DOM in Ant

Posted by David Brownell <da...@pacbell.net>.

[ All of this now being fully irrelevant to ANT now, I'd say ... ]

Glenn Vanderburg wrote:
> 
> (I really don't think this argument should drag on much longer, since
> we seem to be in agreement about what should be done with ant ... but
> I think David Brownell has seriously misunderstood what I was saying,
> and since that's probably my fault, I want to clarify.)
> 
> > It very much _was_ designed to be exposed as an actual representation;
> > that's how DHTML works.
> 
> Really?  Applications that implement DHTML use DOM as the fundamental
> internal representation of the documents?  I don't believe this is
> true in general.  It's certainly one possible implementation strategy,
> but not the only one.

I guess I'd say that declaring one of N representations as the
"actual" or "fundamental" representation is a tricky business,
particularly with nontrivial data models and multiple APIs to
them -- complex regardless of whether DOM is in the picture.

DOM was designed to be exposed as the in-memory representation
of the XML version of data -- how's that?

> In this case, as Costin pointed out, the fundamental entities are not
> document elements; they just happen to be stored and input that way.
> I feel that it would be a hack to force the application to treat them
> as document elements throughout.

That thread of argument wasn't in the post to which I replied; I
could concur to some degree.  There clearly IS a level at which
the components must strictly conform to the XML data model; the
question is thus when (and whether) to convert to another data
structure, perhaps more application-appropriate.

> > I'd be curious to know what is meant by "really ugly" there.  I have
> > my own criticisms of DOM, but I usually try to ignore issues of personal
> > programming style.
> 
> So do I.  One thing I remember is lots of translating back and forth
> between NodeLists and Vectors, and NamedNodeMaps and Hashtables.

When I write code _using_ the DOM, such issues have never once
come up.  Never.  When implementing it ... maybe.

> Another thing is that it yields code that seems to the reader to be
> concerned with nodes and children, when (in this case) it's
> actually working with tasks and dependencies.

That gets to the question I mentioned earlier:  when does one start
to use an application-appropriate data model.  I've certainly seen
cases where such conversions benefit from having a DOM version to
consult, rather than needing to deal with SAX.  And cases where such
conversions are completely inappropriate.

> > > XML applications should work with their own specialized internal
> > > object models, appropriate for the application's needs.  If they need
> > > to provide a DOM-based extensibility interface (something that seems
> > > unlikely for ant) that should be done by other objects that implement
> > > the DOM interfaces and map the internal structure to DOM.
> >
> > The reason this seems to me to be unduly harsh is that it came
> > completely without justification.
> 
> Actually, it seems unduly harsh to me, too, but for a different
> reason: it came without qualification.

Qualifications without justifications are not very useful either! :-)

- Dave

Re: Use of DOM in Ant

Posted by Glenn Vanderburg <gv...@delphis.com>.

(I really don't think this argument should drag on much longer, since
we seem to be in agreement about what should be done with ant ... but
I think David Brownell has seriously misunderstood what I was saying,
and since that's probably my fault, I want to clarify.)

David Brownell <da...@pacbell.net> writes:
>
> Short summary:  Glenn's post is lacking substantiation for his claims,
> so I find it not credible.

:-)  Well, I wasn't trying to present some holy argument from
authority.  I'm presenting my opinions, based on my knowledge of the
technology, some of the discussions about it I saw during the comment
periods on the drafts, and my technical evaluation.

> It very much _was_ designed to be exposed as an actual representation;
> that's how DHTML works.

Really?  Applications that implement DHTML use DOM as the fundamental
internal representation of the documents?  I don't believe this is
true in general.  It's certainly one possible implementation strategy,
but not the only one.

It was designed to be exposed as the *interface* to the actual
representation.  The main crux of the discussion yesterday, I think,
was that Luis didn't realize that you could use a different internal
representation while still exposing a DOM interface.  It's my opinion
that such a strategy is not only possible, but usually the best way to
do things.

> On the other hand, there's no credible alternative data structure API
> for XML data.

Correct.  Nor should there be.  As an application-neutral API for XML
data, DOM is it.

>                                                  And I'll confess that
> I find the notion of each application having its own hack to supplant
> DOM, rather than starting with DOM, less attractive than dealing with
> certain of the current set of warts.

Nobody should try to supplant DOM.  If you have to supply an external
API to your document, use DOM.  But internally you can do something
more appropriate.  That's not a "hack", and it is in no way infringing
on DOM's territory. 

In this case, as Costin pointed out, the fundamental entities are not
document elements; they just happen to be stored and input that way.
I feel that it would be a hack to force the application to treat them
as document elements throughout.

> Luis Arias asked what APIs XSLT processors use.

I interpreted his question to mean the internal representation that
they use.  Providing an interface to a DOM tree is an extremely useful
thing for XSLT processors to do.  Interface and internal
representation are different.

Now, it may well be that DOM makes a good internal representation for
XSLT processors.  But note that the primary job of an XSLT processor
is to manipulate document structures.  

> I'd be curious to know what is meant by "really ugly" there.  I have
> my own criticisms of DOM, but I usually try to ignore issues of personal
> programming style.

So do I.  One thing I remember is lots of translating back and forth
between NodeLists and Vectors, and NamedNodeMaps and Hashtables.

Yes, I understand why NodeList is there, and why it isn't an extension
of Vector.  But nearly all of the reasons apply to the issues of
exposing an interface externally, and they don't apply to the choice
of a private, internal representation.
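
A minimal Java sketch of the translation being described: copying a
NodeList and a NamedNodeMap into ordinary collections. The thread speaks
of Vector and Hashtable; this sketch assumes the later List and Map
interfaces instead. NodeLists and its methods are hypothetical helper
names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeLists {
    /** Copies a NodeList into a List, since NodeList is not a collection. */
    static List<Node> toList(NodeList nodes) {
        List<Node> copy = new ArrayList<>(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            copy.add(nodes.item(i));
        }
        return copy;
    }

    /** Copies attribute names and values out of a NamedNodeMap. */
    static Map<String, String> toMap(NamedNodeMap attributes) {
        Map<String, String> copy = new LinkedHashMap<>();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            copy.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        return copy;
    }

    /** Small helper so the sketch is self-contained: parse and return the root. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }
}
```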

Another thing is that it yields code that seems to the reader to be
concerned with nodes and children, when (in this case) it's
actually working with tasks and dependencies.

> > Using DOM as a predefined in-memory representation of your document
> > can be useful for quick-and-dirty hacks, but it's not a good long-term
> > strategy.
> 
> Would you care to provide some basis for those assertions?

Again: I was not pretending to be bringing down some message from on
high.  It's my opinion.  I apologize if I sounded imperious.

> > XML applications should work with their own specialized internal
> > object models, appropriate for the application's needs.  If they need
> > to provide a DOM-based extensibility interface (something that seems
> > unlikely for ant) that should be done by other objects that implement
> > the DOM interfaces and map the internal structure to DOM.
> 
> The reason this seems to me to be unduly harsh is that it came
> completely without justification.

Actually, it seems unduly harsh to me, too, but for a different
reason: it came without qualification.  There are applications where
DOM is a perfectly reasonable internal representation.  But it's not
right for every case, and the crucial thing is that it doesn't have to
be an exclusive choice; you can choose a specialized internal
representation and still expose DOM if required.  It's not even
difficult. 
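
A minimal Java sketch of keeping a specialized internal representation
while still handing out DOM when asked. Note the simplification: this
version builds a DOM snapshot on request rather than providing live
objects that implement the DOM interfaces, which is what the text above
actually describes. BuildModel and its element names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical internal model: plain fields, no DOM types in its state. */
class BuildModel {
    private final String projectName;
    private final List<String> targetNames = new ArrayList<>();

    BuildModel(String projectName) {
        this.projectName = projectName;
    }

    void addTarget(String name) {
        targetNames.add(name);
    }

    /** Builds a fresh DOM view of the internal model on request. */
    Document toDom() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element project = doc.createElement("project");
        project.setAttribute("name", projectName);
        doc.appendChild(project);
        for (String name : targetNames) {
            Element target = doc.createElement("target");
            target.setAttribute("name", name);
            project.appendChild(target);
        }
        return doc;
    }
}
```

Clients that want DOM call toDom(); everything else in the application
works with the model's own methods.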

-- 
Glenn Vanderburg
Delphi Consultants, LLC
glv@delphis.com

Re: Use of DOM in Ant

Posted by David Brownell <da...@pacbell.net>.

Short summary:  Glenn's post is lacking substantiation for his claims,
so I find it not credible.

Glenn Vanderburg wrote:
> 
> "Luis Arias" <lu...@elysia.com> writes:
> >
> > Of course, all of this depends on one's point of view.  I really believe
> > there are some advantages in using the DOM as the unique data structure for
> > hierarchically organized data in many contexts, and maybe in Ant.
> 
> Let's be clear about what DOM is.  It's a generic extensibility
> interface for applications that work with hierarchically structured
> documents.  It doesn't work particularly well as the actual internal
> representation of those documents, and it was never intended to work
> that way.

DOM was basically evolved from the Dynamic HTML support in v4 web
browsers, and isn't "generic" for more than SGML-ish documents.
It very much _was_ designed to be exposed as an actual representation;
that's how DHTML works.  You can trace many of the problems folk have
with it to that DHTML (browser) heritage.

On the other hand, there's no credible alternative data structure API
for XML data.  There are developers that prefer rolling their own such
APIs, but even more of them would rather avoid that.  ECS doesn't seem
to solve the same problem, from what I've seen.  And I'll confess that
I find the notion of each application having its own hack to supplant
DOM, rather than starting with DOM, less attractive than dealing with
certain of the current set of warts.

Luis Arias asked what APIs XSLT processors use.  I've seen three so
far:  files as input, SAX events streams as input, and a DOM tree as
input.  James Clark's implementation works with Sun's DOM -- which I
designed with reference to then-current XSLT drafts, including many
features that XSLT needs but DOM L1 doesn't expose -- but also has a
caveat that DOM input is not as efficient as the other forms.  It's
an exercise for the reader to find out why that is.  Oracle and
Microsoft use DOM in their XSLT API, as I recall.  And clearly XSLT
is designed to work with "documents".
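
A minimal Java sketch of the three input forms listed above (stream or
file, SAX, and DOM), using the JAXP javax.xml.transform API, which
postdates this thread and is not one of the processors being discussed.
To stay self-contained it runs the identity transform instead of a real
stylesheet; ThreeInputs and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;

class ThreeInputs {
    /** Runs the identity transform over any kind of Source and returns the text. */
    static String copy(Source source) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    // Character stream input; a file would be wrapped the same way.
    static Source asStream(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    // SAX input: the processor pulls events from a SAX parser.
    static Source asSax(String xml) {
        return new SAXSource(new InputSource(new StringReader(xml)));
    }

    // DOM input: the document is parsed into a tree first.
    static Source asDom(String xml) {
        try {
            return new DOMSource(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }
}
```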

> There are a lot of cumbersome aspects to the DOM interfaces that are
> there because they wanted to keep it relatively language-independent.
> So it's not very Java-like.  It turns out that if you base your
> application around DOM, the code turns out to be really ugly.  That's
> not a criticism of DOM; it just isn't what DOM was designed for.

I'd be curious to know what is meant by "really ugly" there.  I have
my own criticisms of DOM, but I usually try to ignore issues of personal
programming style.

DOM very much _was_ designed to address parts of applications where
there's a need to enforce the XML data model.

> Using DOM as a predefined in-memory representation of your document
> can be useful for quick-and-dirty hacks, but it's not a good long-term
> strategy.

Would you care to provide some basis for those assertions?

> XML applications should work with their own specialized internal
> object models, appropriate for the application's needs.  If they need
> to provide a DOM-based extensibility interface (something that seems
> unlikely for ant) that should be done by other objects that implement
> the DOM interfaces and map the internal structure to DOM.

The reason this seems to me to be unduly harsh is that it came
completely without justification.  There are certainly cases I'd
not recommend DOM, but then I can provide reasons for those cases.
Likewise, there are certainly cases that I _would_ recommend DOM.

- Dave

Re: Use of DOM in Ant

Posted by Stefano Mazzocchi <st...@apache.org>.

Luis Arias wrote:
> 
> > That's not what I said.  You certainly *can* use it to manipulate your
> > document; you shouldn't feel that you have to.  DOM is a generic
> > interface to documents; it is rarely a natural representation for the
> > data that the documents represent.
> >
> > I assure you that (for example) Mozilla and IE do not represent
> > documents internally as DOM trees, even though they can present DOM
> > interfaces to those documents when required.
> >
> 
> Hmm !  That's interesting ...  Could you explain to me (maybe in private
> mail) how this works ?  I would have suspected this because of performance
> reasons of course, but I wonder if the sort of philosophical design reasons
> we have been discussing here had something to do with that, or simple
> historical code base reasons...

Look, 

I've been dealing with SAX vs. DOM for the last 8 months: SAX is a
simple API but has very hard design decisions to understand. DOM is the
other way around.

People want to "DOMinate" the world by adding, for example, persistency
to DOM to allow you to update your pages by simply adding nodes while the
persistency code takes care of everything.

Cool? No way. Interesting idea, true. But design patterns and APIs are
different beasts: DOM is a simple way to access your document data.
Period. Nothing less, nothing more. Nothing fancy for Java nor special
C++ cases. Simple stuff that does its simple job.

In brief: there are cases where DOM is better, some cases when SAX is
better. Design decisions should tell you when and where. Design
decisions should also tell you when you don't have to care :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

> That's not what I said.  You certainly *can* use it to manipulate your
> document; you shouldn't feel that you have to.  DOM is a generic
> interface to documents; it is rarely a natural representation for the
> data that the documents represent.
>
> I assure you that (for example) Mozilla and IE do not represent
> documents internally as DOM trees, even though they can present DOM
> interfaces to those documents when required.
>

Hmm !  That's interesting ...  Could you explain to me (maybe in private
mail) how this works ?  I would have suspected this because of performance
reasons of course, but I wonder if the sort of philosophical design reasons
we have been discussing here had something to do with that, or simple
historical code base reasons...

Re: Use of DOM in Ant

Posted by Glenn Vanderburg <gv...@delphis.com>.

> Hmm..  There seems to be a slight contradiction in your argumentation
> here...  It's really too bad to develop a standard interface for
> hierarchical documents and then not be able to use it to manipulate your
> document because it was not intended to actually represent it faithfully or
> within certain performance constraints in memory !!!

That's not what I said.  You certainly *can* use it to manipulate your
document; you shouldn't feel that you have to.  DOM is a generic
interface to documents; it is rarely a natural representation for the
data that the documents represent.

I assure you that (for example) Mozilla and IE do not represent
documents internally as DOM trees, even though they can present DOM
interfaces to those documents when required.

Many of the people who defined DOM are quite amused that people think
it is appropriate for an application's internal representation.

> I think more and more applications will be based on DOM because it's a
> standard api and XML parsers allow you to bring DOMs in memory and
> manipulate them..  Even if it is a bit ugly...

I think you're right that more applications will be based on DOM, but
I think that for many of those applications it's the wrong decision.

>             maybe someday someone will want to do something even cooler with
> ant and they may have to do some ugly hack to get around that design
> decision..

Nah.  If ant ever needs a DOM interface, someone can just build
adapters --- implementations of the DOM interfaces that map to the
native ant objects.  That's not an ugly hack; it's one of the things
interfaces are for, and it's how DOM was designed to work.
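
As an illustration only, the adapter idea could be sketched in Java roughly
like this.  Nothing here comes from the Ant code base: CopyTask and
TaskDomAdapter are hypothetical names, and a dynamic proxy stands in for a
hand-written implementation of org.w3c.dom.Element so that only a handful of
methods need mapping.  Every unmapped method (including toString) throws.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.LinkedHashMap;
import java.util.Map;
import org.w3c.dom.Element;

// Hypothetical native object: a plain build action that knows nothing about DOM.
class CopyTask {
    final Map<String, String> attrs = new LinkedHashMap<>();

    CopyTask(String src, String dest) {
        attrs.put("src", src);
        attrs.put("dest", dest);
    }
}

// Hypothetical adapter: presents a read-only Element view of a CopyTask.
class TaskDomAdapter {
    static Element asElement(CopyTask task) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getTagName":
                case "getNodeName":
                    return "copy";
                case "getAttribute":
                    String value = task.attrs.get((String) args[0]);
                    return value == null ? "" : value; // DOM returns "" when absent
                case "hasAttribute":
                    return task.attrs.containsKey((String) args[0]);
                case "getNodeType":
                    return Element.ELEMENT_NODE;
                default:
                    // Everything else is deliberately left unmapped in this sketch.
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Element) Proxy.newProxyInstance(
                TaskDomAdapter.class.getClassLoader(),
                new Class<?>[] {Element.class},
                handler);
    }
}
```

A caller that only understands DOM can then do
TaskDomAdapter.asElement(task).getAttribute("src") while the task itself
stays an ordinary object.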

-- 
Glenn Vanderburg
Delphi Consultants, LLC
glv@delphis.com

Re: Use of DOM in Ant

Posted by co...@eng.sun.com.

> I think more and more applications will be based on DOM because it's a
> standard api and XML parsers allow you to bring DOMs in memory and
> manipulate them..  Even if it is a bit ugly...

D in DOM means Document. If you want to represent a document, DOM may
be the right object model.
DOM is a standard API - to represent XML.

You either design your application using objects, or use DOM ( or lists !) to
represent all your data. Even in this case, DOM is not a good choice (IMHO)-
you can use a Tree or something that doesn't have all the DOM overhead.

( BTW, I would agree to represent all the data in tuples in 4th NF :-)

Costin

Re: Use of DOM in Ant - A summary ?

Posted by Stefano Mazzocchi <st...@apache.org>.

Ben Laurie wrote:
> 
> costin@eng.sun.com wrote:
> > BTW, try to read a book about relational databases ( Date?)
> 
> Date is the canonical RDBMS book. It also shows, IMO, why you shouldn't
> let academics design computer systems, but I'm several decades too late
> for that observation.

I tend to agree with this :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche

Re: Use of DOM in Ant - A summary ?

Posted by Ben Laurie <be...@algroup.co.uk>.

costin@eng.sun.com wrote:
> BTW, try to read a book about relational databases ( Date?)

Date is the canonical RDBMS book. It also shows, IMO, why you shouldn't
let academics design computer systems, but I'm several decades too late
for that observation.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: Use of DOM in Ant - A summary ?

Posted by co...@eng.sun.com.

>
> Did I correctly summarize your thoughts ?  I am of course a bit disappointed
> in discovering this because I had great hopes for the DOM as a sort of
> general design pattern.

Well - the general design pattern is to use objects that represent your
application.
You can represent the attributes using standard bean getter/setter,
same for parent/child ( or any other relation ).

As Stefano said, DOM is more powerful ( you can use it to manipulate
the document, and you can access any element whenever you want).
SAX is harder to use and has some limitations - but it's an excellent
design if you care about speed ( and style :-).

XMLHelper was just a way to avoid writing SAX - i.e. it creates all the
objects ( in application-space object model) - assuming you follow some
simple patterns ( get/set for attributes, addXXX for child/parent, etc). You
can do the same thing using DOM ( i.e. read an XML file and create the
corresponding application objects ) - and you can write "generic" code too.
( right now the code in ANT is ant-specific, same for the code in Tomcat).
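
For readers who want to see the get/set pattern concretely, here is a minimal
sketch of a SAX handler that fills in an application object by reflection.
It is not the XmlHelper code: Echo and SetterMappingHandler are hypothetical
names, only one element type is mapped, and the addXXX handling for
child/parent relations is left out.

```java
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical application object following the bean get/set convention.
class Echo {
    private String message;

    public void setMessage(String message) { this.message = message; }
    public String getMessage() { return message; }
}

// Hypothetical handler: maps <echo .../> elements onto Echo objects.
class SetterMappingHandler extends DefaultHandler {
    final List<Object> created = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (!qName.equals("echo")) {
            return; // other elements are ignored in this sketch
        }
        Echo target = new Echo();
        try {
            for (int i = 0; i < atts.getLength(); i++) {
                String name = atts.getQName(i);
                // attribute "message" maps to setMessage(String)
                String setter = "set" + Character.toUpperCase(name.charAt(0))
                        + name.substring(1);
                Method m = target.getClass().getMethod(setter, String.class);
                m.invoke(target, atts.getValue(i));
            }
        } catch (ReflectiveOperationException e) {
            throw new SAXException(e);
        }
        created.add(target);
    }

    // Parses the given XML text and returns the objects that were created.
    static List<Object> parse(String xml) {
        try {
            SetterMappingHandler handler = new SetterMappingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.created;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The application never sees a parser type: it gets back plain objects, and an
attribute with no matching setter surfaces as a parse error.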

I think it's useful to have this "generic" code ( it'll save a lot of typing
:-), I'll commit it again in a branch when I finish the builds ( including
Duncan's suggestions :-).


BTW, try to read a book about relational databases ( Date?) - and
about O-R mapping. While I disagree with DOM as "general design pattern",
I still believe that you can represent everything in tuples ( relations) -
and get all the wonderful advantages out of it. It's almost the same
discussion ( and they haven't reached an agreement yet ).

Costin

Re: Use of DOM in Ant - A summary ?

Posted by Luis Arias <lu...@elysia.com>.

Hello again,

I thought I would post this as a sort of summary of my understanding of the
discussion here because it might be relevant to other projects.

Basically I had suggested using the DOM api's directly or with slight
indirection as an alternative to mapping a build file to an ad-hoc object
model in Ant.

It was pointed out by various participants that although in principle this
would work it would not be clean design for several reasons, but it appears
the most important one has to do with layering of functionality or
separation of concerns:

    The purpose of DOM is to represent XML and not an application-specific
object model.

After thought, I tend now to agree with these observations and I thank the
various participants for their input.  The main reason is that in a general
case, your XML document could be itself highly structured, for instance an
RDF document, so it would be very tedious and probably error-prone to deal
with RDF abstractions using DOM apis.  For each level of abstraction in the
XML document, you would ideally have some kind of mapping to an
abstraction-specific object model and an accompanying api.  Ideally also,
each level would keep some kind of referential integrity to the immediate
sub-level, at least for applications which would modify the XML
representation, since this would allow conservation of ancillary information
at each level (for instance comments).

Did I correctly summarize your thoughts ?  I am of course a bit disappointed
in discovering this because I had great hopes for the DOM as a sort of
general design pattern.

Thanks, again for the excellent discussion.  Let's apply this now ! :-)

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

> > So maybe for ant, thinking
> > about hierarchical builds etc.. is too much of a requirement change, but
> > then again, maybe someday someone will want to do something even cooler with
> > ant and they may have to do some ugly hack to get around that design
> > decision..
>
> Maybe. But I want to see more experience using Ant like it is before we
> go making it too much fancier. I would rather see effort expended on
> creating more Tasks -- expanding what Ant can do -- rather than
> expending it on the internals of how those tasks get called.
>

Ok, no prob..  Let me know if I can help and thanks for your views on XML !

Re: Use of DOM in Ant

Posted by James Duncan Davidson <ja...@eng.sun.com>.

> For instance, do you know if XSL processor implementations use the DOM or
> SAX api ?  I really don't know, but I wouldn't be surprised if they used one
> or the other...

When I'm using XSL -- I don't really care what's underneath.. For the
time being, my interface has been `xt foo.xml foo.xsl foo.html`.
Whatever XT does under the covers is fine by me.

> I think more and more applications will be based on DOM because it's a
> standard api and XML parsers allow you to bring DOMs in memory and
> manipulate them..  Even if it is a bit ugly...

DOM is about the Document view of XML. There are two camps in XML,
Documents and Data. You can look at this as Server vs. Client and
E-Commerce vs. Publishing. If I'm dealing with a Document in a browser
or a program, DOM might very well be a way to approach it. Maybe not.
Depends on the application. If I'm using XML for raw data, then I want a
smooth approach to translating that into an object graph that I'm
comfortable with.

Look at it another way, I don't want to have XML as a storage format
buried in the internals of my program. XML might not be how I want to
represent my object graph in the future. XML is great, but the
simplicity and flexibility of the code is very important as well.

> Oh !  I agree that ant is a fairly simple program and there a design
> decision was obviously made that favors a mapping from DOM to Task objects
> for easy manipulation, understanding, etc... 

Actually I used DOM 'cause I was too lazy at the time to use the SAX
callbacks. Costin's new changes will move the parse cycle to using just
straight SAX which is fine, simpler, probably faster.
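
To make the "parse with DOM, then map to objects" approach concrete, here is
a rough sketch.  It is not the Ant parsing code: TaskSpec and DomToObjects
are hypothetical names, and only a single "dir" attribute is copied.  The
point is that the DOM tree is a throwaway intermediate and the program keeps
only its own objects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical plain object; it holds no reference to any DOM node.
class TaskSpec {
    final String name;
    final String dir;

    TaskSpec(String name, String dir) {
        this.name = name;
        this.dir = dir;
    }
}

// Hypothetical loader: reads the XML into a DOM, then copies out what it needs.
class DomToObjects {
    static List<TaskSpec> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<TaskSpec> tasks = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes and comments
                }
                Element e = (Element) n;
                tasks.add(new TaskSpec(e.getTagName(), e.getAttribute("dir")));
            }
            return tasks; // the DOM tree is no longer referenced after this point
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

If XML stops being the storage format later, only load() has to change;
TaskSpec and everything that uses it stay as they are.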

> So maybe for ant, thinking
> about hierarchical builds etc.. is too much of a requirement change, but
> then again, maybe someday someone will want to do something even cooler with
> ant and they may have to do some ugly hack to get around that design
> decision..

Maybe. But I want to see more experience using Ant like it is before we
go making it too much fancier. I would rather see effort expended on
creating more Tasks -- expanding what Ant can do -- rather than
expending it on the internals of how those tasks get called.


-- 
James Davidson                                     duncan@eng.sun.com 
Java + XML / Portable Code + Portable Data                 !try; do()

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

----- Original Message -----
From: Glenn Vanderburg <gv...@delphis.com>
To: <ge...@jakarta.apache.org>
Sent: Wednesday, November 17, 1999 1:30 PM
Subject: Re: Use of DOM in Ant

> "Luis Arias" <lu...@elysia.com> writes:
> >
> > Of course, all of this depends on one's point of view.  I really believe
> > there are some advantages in using the DOM as the unique data structure for
> > hierarchically organized data in many contexts, and maybe in Ant.
>
> Let's be clear about what DOM is.  It's a generic extensibility
> interface for applications that work with hierarchically structured
> documents.  It doesn't work particularly well as the actual internal
> representation of those documents, and it was never intended to work
> that way.

Hmm..  There seems to be a slight contradiction in your argumentation
here...  It's really too bad to develop a standard interface for
hierarchical documents and then not be able to use it to manipulate your
document because it was not intended to actually represent it faithfully or
within certain performance constraints in memory !!!

For instance, do you know if XSL processor implementations use the DOM or
SAX api ?  I really don't know, but I wouldn't be surprised if they used one
or the other...

>
> There are a lot of cumbersome aspects to the DOM interfaces that are
> there because they wanted to keep it relatively language-independent.
> So it's not very Java-like.  It turns out that if you base your
> application around DOM, the code turns out to be really ugly.  That's
> not a criticism of DOM; it just isn't what DOM was designed for.
> Using DOM as a predefined in-memory representation of your document
> can be useful for quick-and-dirty hacks, but it's not a good long-term
> strategy.
>

I think more and more applications will be based on DOM because it's a
standard api and XML parsers allow you to bring DOMs in memory and
manipulate them..  Even if it is a bit ugly...

> XML applications should work with their own specialized internal
> object models, appropriate for the application's needs.  If they need
> to provide a DOM-based extensibility interface (something that seems
> unlikely for ant) that should be done by other objects that implement
> the DOM interfaces and map the internal structure to DOM.
>

Oh !  I agree that ant is a fairly simple program and there a design
decision was obviously made that favors a mapping from DOM to Task objects
for easy manipulation, understanding, etc...  So maybe for ant, thinking
about hierarchical builds etc.. is too much of a requirement change, but
then again, maybe someday someone will want to do something even cooler with
ant and they may have to do some ugly hack to get around that design
decision..

Re: Use of DOM in Ant

Posted by Glenn Vanderburg <gv...@delphis.com>.

"Luis Arias" <lu...@elysia.com> writes:
> 
> Of course, all of this depends on one's point of view.  I really believe
> there are some advantages in using the DOM as the unique data structure for
> hierarchically organized data in many contexts, and maybe in Ant.

Let's be clear about what DOM is.  It's a generic extensibility
interface for applications that work with hierarchically structured
documents.  It doesn't work particularly well as the actual internal
representation of those documents, and it was never intended to work
that way.

There are a lot of cumbersome aspects to the DOM interfaces that are
there because they wanted to keep it relatively language-independent.
So it's not very Java-like.  It turns out that if you base your
application around DOM, the code turns out to be really ugly.  That's
not a criticism of DOM; it just isn't what DOM was designed for.
Using DOM as a predefined in-memory representation of your document
can be useful for quick-and-dirty hacks, but it's not a good long-term
strategy.

XML applications should work with their own specialized internal
object models, appropriate for the application's needs.  If they need
to provide a DOM-based extensibility interface (something that seems
unlikely for ant) that should be done by other objects that implement
the DOM interfaces and map the internal structure to DOM.

-- 
Glenn Vanderburg
Delphi Consultants, LLC
glv@delphis.com

Re: Use of DOM in Ant

Posted by Luis Arias <lu...@elysia.com>.

----- Original Message -----
From: <co...@eng.sun.com>
To: <ge...@jakarta.apache.org>
Sent: Tuesday, November 16, 1999 9:58 PM
Subject: Re: Use of DOM in Ant

> > I noticed some recent factoring in the Ant classes tending to put related
> > functionality in various Helpers (ScriptHelper, ProjectHelper, XmlHelper,
> > etc...).
>
> ProjectHelper was the original code that read the XML.
>
> We separated it in IntrospectionHelper, ScriptHelper and XmlHelper just
> to make it a bit more re-usable and readable, there is almost no new code,
> just methods moved around.
>

Sure !  I saw that and I believe that's a good move...

>
> > I was just wondering if you had considered a slightly different architecture
> > in which instead of converting the build file from xml into basically a
> > vector of Tasks, you might have :
> >
> > 1. Used an ElementFactory subclass to let Sun's parser create appropiate
> > concrete task classes on the fly based on some mapping.
>
> There are 2 problems with that:
> - it means Task would have to implement ElementEx, which is
> a heavy interface ( or extend an ElementEx base class ). The beauty
> of Ant is that it is easy to create Tasks.
> - ElementFactory is Sun specific, while the original code used only
> SAX and DOM apis. Since we use Sun parser that is not such a big
> problem.
>
> I don't think it's a good idea to represent everything as Element and
> use DOM as our internal data representation.
>
> Objects and interfaces have some advantages, and if you use Element
> to represent everything in Ant - or in any other project - you
> lose some of the advantages of object oriented programming.
>
>

Of course, all of this depends on one's point of view.  I really believe
there are some advantages in using the DOM as the unique data structure for
hierarchically organized data in many contexts, and maybe in Ant.  It's sort
of the "DOMinAnt Architecture" if you will ... :-)

But kidding aside, it's true that the ElementEx interface is fairly complex
and from an OOA point of view Tasks are clearly not DOM elements.  However,
you can easily imagine modelling Task objects as you have done, and instead
of implementing the representation of their attributes as instance
variables, delegating the representation to another object, for instance a
DOM element !

In this case  your Task abstract class would be an implementation of the
Facade design pattern, it would allow the different taskdef subclasses to
deal with "task-like" concerns, while the Task class itself would handle
services like setAttribute, getAttribute, etc...  The Task subclasses would
then use these basic services to get/ set their attributes.
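
A minimal sketch of this proposal, assuming the standard org.w3c.dom API.
DomBackedTask, MkdirTask and DomBackedDemo are hypothetical names and are
not Ant classes; execute() only returns a description instead of touching
the file system.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical base class: attribute values live in the DOM element,
// not in instance variables of the task.
abstract class DomBackedTask {
    private final Element element;

    protected DomBackedTask(Element element) {
        this.element = element;
    }

    // Basic services offered to subclasses.
    protected String getAttribute(String name) {
        return element.getAttribute(name);
    }

    protected void setAttribute(String name, String value) {
        element.setAttribute(name, value);
    }

    abstract String execute();
}

// Hypothetical concrete task: typed accessors on top of the basic services.
class MkdirTask extends DomBackedTask {
    MkdirTask(Element element) {
        super(element);
    }

    String getDir() { return getAttribute("dir"); }
    void setDir(String dir) { setAttribute("dir", dir); }

    @Override
    String execute() {
        return "mkdir " + getDir(); // a real task would create the directory
    }
}

class DomBackedDemo {
    // Creates an empty document with a single root element of the given name.
    static Element newElement(String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element e = doc.createElement(tag);
            doc.appendChild(e);
            return e;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Calling setDir("build") on a MkdirTask changes the underlying element, so
the task and the document cannot drift apart.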

That opens up a lot of possibilities because you conserve the structure of
your build !  So if a build action has subactions or a project has
subprojects etc...  This is the way to go !
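
To show what keeping the structure buys, here is a rough sketch of a walker
that visits the build file's DOM directly, nested elements included.
DomProjectWalker is a hypothetical name; instead of running tasks it records
each element with its nesting depth.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical walker: no intermediate Vector of tasks, the DOM is the structure.
class DomProjectWalker {
    // Depth-first, in document order; a real version would execute a task here.
    static void walk(Element e, int depth, List<String> log) {
        log.add(depth + ":" + e.getTagName());
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) c, depth + 1, log);
            }
        }
    }

    // Parses the XML text and returns the visit log.
    static List<String> run(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> log = new ArrayList<>();
            walk(root, 0, log);
            return log;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Subprojects or subactions need no extra data structure here; they are just
deeper levels of the same recursion.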

> > 2. Used a ProjectBuilder class to iterate around relevant nodes in the build
> > file's DOM, running necessary tasks as it goes along.
>
> That would mean Tasks are DOM Elements, which is not true. Element represents
> a part of an XML document and Task represents a build action.
>
> I don't think a build action "extends" an XML node.
>
>

Sure, but see my above proposal...

> > The advantage that I see is that you basically use a single data structure
> > instead of having to map the build file to some other structure
>
> It's not an advantage. Yes, you can use strings to represent everything
> ( or lists :-), but I prefer OOP.
>

Hmm, let's not waive the advantages of the DOM so quickly...

I believe the DOM is a unique API for handling hierarchical data, and in
particular, data that's represented by XML documents.  As you know the
range of data here is phenomenal, music, math, multi-media, uml schemas,
etc...  So Strings are not so bad...

The other thing is that there are two parts to OOD.  You have to have your
analysis model right in terms of functionality (Project, Tasks, Dependencies,
etc), then do your design taking into account constraints such as existing
data, code, legacy systems, etc..  So if you have chosen to represent a
project's build as an XML document, in order to take advantage of XML's
structuring abilities, why not take full advantage of the DOM in the code ?

What do you think ? I would be happy to work on this with you, maybe
draw up a tentative design and code a version 0 so we can compare.

Cheers !

Re: Use of DOM in Ant

Posted by co...@eng.sun.com.

> I noticed some recent factoring in the Ant classes tending to put related
> functionality in various Helpers (ScriptHelper, ProjectHelper, XmlHelper,
> etc...).

ProjectHelper was the original code that read the XML.

We separated it in IntrospectionHelper, ScriptHelper and XmlHelper just
to make it a bit more re-usable and readable, there is almost no new code,
just methods moved around.


> I was just wondering if you had considered a slightly different architecture
> in which instead of converting the build file from xml into basically a
> vector of Tasks, you might have :
>
> 1. Used an ElementFactory subclass to let Sun's parser create appropiate
> concrete task classes on the fly based on some mapping.

There are 2 problems with that:
- it means Task would have to implement ElementEx, which is
a heavy interface ( or extend an ElementEx base class ). The beauty
of Ant is that it is easy to create Tasks.
- ElementFactory is Sun specific, while the original code used only
SAX and DOM apis. Since we use Sun parser that is not such a big
problem.

I don't think it's a good idea to represent everything as Element and
use DOM as our internal data representation.

Objects and interfaces have some advantages, and if you use Element
to represent everything in Ant - or in any other project - you
lose some of the advantages of object oriented programming.


> 2. Used a ProjectBuilder class to iterate around relevant nodes in the build
> file's DOM, running necessary tasks as it goes along.

That would mean Tasks are DOM Elements, which is not true. Element represents
a part of an XML document and Task represents a build action.

I don't think a build action "extends" an XML node.


> The advantage that I see is that you basically use a single data structure
> instead of having to map the build file to some other structure

It's not an advantage. Yes, you can use strings to represent everything
( or lists :-), but I prefer OOP.


Costin