Posted to j-dev@xerces.apache.org by Ted Leung <tw...@sauria.com> on 2001/03/13 09:23:57 UTC

Re: [Xerces2] Packages (LONG)

Ok.  Here's my take:

This is important, and we need to do it.  We are constantly getting beaten
up over the size of the jars (and we could actually do something about it
now, but that's another story).  Combine this with the pipeline
configuration object stuff and it's a no-brainer.  All we have to do is
decide what goes where and what the packages are called.  To a first
approximation, it doesn't matter to me nearly as much what we do as that we
do something.  Minimally, I'd like to see all the "user-level" APIs -- DOM,
SAX, HTML DOM, WML DOM, and Serializers -- partitioned into separate
packages.  Whether or not we deliver them in a single jar is a problem we
can tackle after we can prove that we can cut the thing up into pieces and
stick 'em back together any way we want.  If we can do the assembly via a
config file or even via a build-time tool, that would be *swell*, and would
be a cool project that would be ideal for someone new to work on.
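
As a rough illustration of assembly via a config file, here is a minimal
Java sketch that maps named modules to package prefixes and turns a
selection into Ant-style include patterns for a jar.  Everything in it is an
assumption made for illustration: the class name JarAssemblySketch, the
module names ("core", "sax", "dom") and the config layout are hypothetical,
not an existing Xerces build tool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: not part of any real Xerces build.
final class JarAssemblySketch {

    // Returns one include pattern per package listed for each selected module.
    static List<String> includesFor(Properties modules, List<String> selected) {
        List<String> patterns = new ArrayList<>();
        for (String module : selected) {
            String packages = modules.getProperty(module);
            if (packages == null) {
                throw new IllegalArgumentException("Unknown module: " + module);
            }
            // A module may list several comma-separated package prefixes.
            for (String pkg : packages.split(",")) {
                patterns.add(pkg.trim().replace('.', '/') + "/**");
            }
        }
        return patterns;
    }

    public static void main(String[] args) throws IOException {
        // Assumed config format: module name = comma-separated package prefixes.
        Properties modules = new Properties();
        modules.load(new StringReader(
                "core=org.apache.xerces.xni\n"
                + "sax=org.xml.sax\n"
                + "dom=org.w3c.dom\n"));

        // A "SAX only" configuration: the core plus the SAX API.
        System.out.println(includesFor(modules, Arrays.asList("core", "sax")));
    }
}
```

The resulting patterns could be fed to whatever packaging step builds the
jar; the point is only that the choice of pieces lives in data rather than
in the build script itself.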

As far as donations, I have 2 thoughts.  The first is, let's see if someone
out there wants to step up and take over some of these -- in particular
serializers, HTML and WML.  After that we can find some other unsavory way
of dealing with them.  The easiest being that every piece of code that
moves from X1 to X2 has a committer who is willing to do the maintenance on
that piece.  No maintainer, no appearance in X2.  The second thought is:
this sort of donation policy is the sort of thing that probably belongs up
one level in general.

Oh, and before you go scaring people by putting (LONG) in the subject,
please make sure that your message really is.  This one was nothing
compared to the one you got Arnaud for ;-)

Ted
----- Original Message -----
From: "Andy Clark" <an...@apache.org>
To: <xe...@xml.apache.org>
Sent: Tuesday, March 13, 2001 2:17 PM
Subject: [Xerces2] Packages (LONG)


> A problem that still needs to be addressed is how the Xerces2
> packages are to be arranged. This message is an attempt to
> start a conversation about figuring out how to handle the
> following:
>
>   1) XNI Breakdown
>   2) XNI Extensions
>   3) Xerces Project Contributions
>
> XNI Breakdown
>
> Currently, XNI includes what I call the streaming information
> set interfaces as well as interfaces defining a parser setup.
> The information set interfaces include such things as the
> XMLDocumentHandler, XMLDTDHandler, XMLDTDContentModelHandler,
> and the other supporting interfaces and classes.
>
> Then we have interfaces that define a parser setup, which are
> things like XMLComponent and XMLComponentManager. These are
> needed to construct a parser from a set of components and
> manage the initialization in a uniform way.
>
> In addition to these two sets of interfaces, we've included
> filter interfaces for the various handler interfaces because
> that is what comprises a parser pipeline -- namely, a source,
> zero or more filters, and a target (the final handler in the
> pipeline).
>
> This seems too complicated to be lumped together.
>
> I would like to see XNI broken out into an XNI Core and a
> series of XNI extensions. The Core would only contain the
> basic handler interfaces and supporting interfaces and
> classes. Specifically:
>
>   XMLDocumentHandler
>   XMLDTDHandler, XMLDTDContentModelHandler *
>   QName
>   XMLAttributes
>   XMLString
>
>   * Discussions are happening to redefine the DTD handlers
>     so that they are more useful to users and not just to
>     the parser implementation. See the thread "[XNI] DTD
>     Information Set" for more information and to help in
>     defining these interfaces.
>
> The package would remain the same: org.apache.xerces.xni
>
> The remaining interfaces need to be placed somewhere (if they
> are still deemed necessary). Perhaps they could make up the
> first installment of standard XNI extensions.
>
> XNI Extensions
>
> Beyond the streaming information set, there are a series of
> core components that are extremely useful for defining parser
> pipelines and configurations. Part of this is already defined
> as part of the current XNI package while others would be what
> Arnaud just posted regarding parser configurations. And these
> would not be the only ones that are outside of "Core" because
> there will be further developments and outside contributions.
> So the question is: where do these live?
>
> One suggestion I have is to create an ".ext" package in the
> same way that SAX2 has an extensions package. This would make
> it clear that they are not considered XNI Core. But does this
> single package become the dumping ground for all of the
> current and future XNI extensions? I hope not.
>
> I don't have any good ideas, yet, about how to scale this
> extension mechanism. So I'm open to what everyone else
> thinks in this area. And the answer to this is related to
> how we deal with Xerces project contributions in general.
>
> Xerces Project Contributions
>
> I think that we need a policy about donations. There are a
> lot of things that are up in the air every time that something
> is donated to the Xerces project. Granted, there haven't been
> many donations recently but I think that Xerces2 gives us a
> great opportunity to establish a policy and then to migrate
> the old submissions to Xerces2 following this policy.
>
> There have been a lot of complaints about the size of the
> Xerces-J binary download and the size of the Jar file. The
> size is partially related to the actual implementation of
> the parser but a lot of the size comes from all of the
> donated (or should I say "dumped"?) code. Many examples
> come to mind: serializers (which are actually needed by a
> large number of people but are included in this list because
> there is no longer any support for this donated code); HTML
> DOM; and WML DOM.
>
> Having a policy in place for contributions would help us to
> separate the build and packaging and also define a clear
> support structure. In other words, I don't think that we
> should accept any code if there is no continued support of
> that code.
>
> Also, people wouldn't have to pay the cost of the added
> code if they never used it. They would just download those
> components that they needed. These are all good things, in
> my opinion.
>
> Before I propose anything, does anyone else have any ideas
> or suggestions?
>
> --
> Andy Clark * IBM, TRL - Japan * andyc@apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-j-dev-help@xml.apache.org
>
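
To make the pipeline described in the quoted message concrete (a source,
zero or more filters, and a target), here is a minimal Java sketch.  The
types DocumentHandler, DocumentSource and DocumentFilter, and the example
classes built on them, are hypothetical stand-ins invented for
illustration; they are not the actual XNI interfaces, which carry many more
callbacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: every type below is a hypothetical stand-in, not real XNI.
final class PipelineSketch {

    // Simplified handler with just three callbacks.
    interface DocumentHandler {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    // A source pushes events to whichever handler is installed downstream.
    interface DocumentSource {
        void setDocumentHandler(DocumentHandler handler);
    }

    // A filter receives events like a handler and forwards them like a source.
    interface DocumentFilter extends DocumentHandler, DocumentSource {
    }

    // Example source: emits a fixed event sequence instead of scanning a document.
    static final class FixedSource implements DocumentSource {
        private DocumentHandler handler;

        @Override
        public void setDocumentHandler(DocumentHandler handler) { this.handler = handler; }

        void emit() {
            handler.startElement("greeting");
            handler.characters("hello");
            handler.endElement("greeting");
        }
    }

    // Example filter: upper-cases element names, passes character data through.
    static final class UpperCaseNames implements DocumentFilter {
        private DocumentHandler next;

        @Override
        public void setDocumentHandler(DocumentHandler handler) { this.next = handler; }

        @Override
        public void startElement(String name) { next.startElement(name.toUpperCase(Locale.ROOT)); }

        @Override
        public void characters(String text) { next.characters(text); }

        @Override
        public void endElement(String name) { next.endElement(name.toUpperCase(Locale.ROOT)); }
    }

    // Example target: the final handler, which just records what it receives.
    static final class Recorder implements DocumentHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String name) { events.add("start:" + name); }

        @Override
        public void characters(String text) { events.add("text:" + text); }

        @Override
        public void endElement(String name) { events.add("end:" + name); }
    }

    // Chains source -> filters (in order) -> target.
    static void connect(DocumentSource source,
                        List<? extends DocumentFilter> filters,
                        DocumentHandler target) {
        DocumentSource upstream = source;
        for (DocumentFilter filter : filters) {
            upstream.setDocumentHandler(filter);
            upstream = filter;
        }
        upstream.setDocumentHandler(target);
    }

    // Builds a one-filter pipeline, runs it, and returns the recorded events.
    static List<String> run() {
        FixedSource source = new FixedSource();
        Recorder target = new Recorder();
        connect(source, Arrays.asList(new UpperCaseNames()), target);
        source.emit();
        return target.events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [start:GREETING, text:hello, end:GREETING]
    }
}
```

With an empty filter list the source is wired straight to the target, which
is the "zero filters" case.  A core package would only need something like
the handler interface; the source and filter contracts are the sort of
thing that could sit in an extension package.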


Re: [Xerces2] Packages (LONG)

Posted by Andy Clark <an...@apache.org>.
Petr Kuzel wrote:
> The selectivity is not necessary. You simply pack no SAX
> class in xerces.jar. Why not simply suppose that SAX is

That's certainly possible and I don't have a real problem
with that. It also keeps us from having to build and rebuild
all of the SAX and DOM source because it doesn't change very
often.

> at runtime at classpath as for example java.net.URL is supposed?
> Of course a distribution containing xerces.jar, sax.jar, dom.jar
> jaxp.jar, jdom.jar ... could exist.

Yep, we can do this.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org


Re: [Xerces2] Packages (LONG)

Posted by Petr Kuzel <Pe...@sun.com>.
Andy Clark wrote:
>
> But the configuration relies on SAX exception classes and
> other things like InputSource. Are you suggesting that we
> selectively package the individual SAX classes needed by
> each part? We can keep our tree cleanly separated (even if
> it's not that way at the moment) but separating SAX by the
> classes that we use is a little inconvenient for build and
> packaging.

The selectivity is not necessary. You simply package no SAX
classes in xerces.jar. Why not simply assume that SAX is on the
classpath at runtime, just as java.net.URL, for example, is
assumed to be? Of course, a distribution containing xerces.jar,
sax.jar, dom.jar, jaxp.jar, jdom.jar ... could exist.

If a user does not place the correct version of SAX on the
classpath, they simply get some sort of ClassNotFoundException or
NoSuchMethodException. There will be a FAQ ....
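
A minimal Java sketch of how that failure could be turned into a clearer
message: probe for the API classes up front instead of waiting for a
failure deep inside parsing. The helper class ClasspathProbe is
hypothetical (it is not part of Xerces); Class.forName is the standard JDK
call. When nothing checks in advance, a missing or mismatched API jar
typically surfaces at link time as NoClassDefFoundError or
NoSuchMethodError.

```java
// Hypothetical helper, shown only to illustrate the idea.
final class ClasspathProbe {

    // Returns true if the named class can be loaded, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but unloadable.
            return false;
        }
    }

    public static void main(String[] args) {
        // API classes an implementation-only jar would expect to find elsewhere.
        String[] required = {
            "org.xml.sax.XMLReader",
            "org.xml.sax.InputSource",
            "org.w3c.dom.Document"
        };
        for (String name : required) {
            System.out.println(name + (isPresent(name) ? ": found" : ": MISSING from classpath"));
        }
    }
}
```

This only detects absent classes. It does not detect an older API version
that lacks a method, which is the case a FAQ entry would still have to
cover.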

  Cc.

-- 
<address>
<a href="mailto:pkuzel@netbeans.com">Petr Kuzel</a>, Sun Microsystems
: <a href="http://www.sun.com/forte/ffj/ie/">Forte Tools</a>
: XML and <a href="http://jini.netbeans.org/">Jini</a> modules</address>


Re: [Xerces2] Packages (LONG)

Posted by Andy Clark <an...@apache.org>.
Petr Kuzel wrote:
> I do not think that a configuration package should contain interface
> APIs such as SAX or DOM. We should think about packaging vs. distribution.

But the configuration relies on SAX exception classes and
other things like InputSource. Are you suggesting that we 
selectively package the individual SAX classes needed by 
each part? We can keep our tree cleanly separated (even if
it's not that way at the moment) but separating SAX by the
classes that we use is a little inconvenient for build and
packaging.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org


Re: [Xerces2] Packages (LONG)

Posted by Petr Kuzel <Pe...@sun.com>.
Andy Clark wrote:

> SAX takes about 55K (debug build, I think) which is small in
> comparison. My point is that we should be able to build and
> package a small set of "common" configurations: all, SAX
> only, DOM only, etc. Obviously we can't handle building and
> packaging every kind of user configuration in the world so
> let's just hit the sweet spot and let the others handle
> their own build and packaging details.

I do not think that a configuration package should contain interface
APIs such as SAX or DOM. We should think about packaging vs. distribution.
The distribution would consist of a set of packages (the implementation,
SAX, DOM, ...), so users can simply remove all but the implementation if
they already have the right versions of the others on the classpath (say,
in jre/lib/ext).

  Cc.

-- 
<address>
<a href="mailto:pkuzel@netbeans.com">Petr Kuzel</a>, Sun Microsystems
: <a href="http://www.sun.com/forte/ffj/ie/">Forte Tools</a>
: XML and <a href="http://jini.netbeans.org/">Jini</a> modules</address>


Re: [Xerces2] Packages (LONG)

Posted by Andy Clark <an...@apache.org>.
Ted Leung wrote:
> Definitely JAXP is a user level API.  SAX and the code for SAX parser
> still take up space.  Some JDOM people might just want the low level

SAX takes about 55K (debug build, I think) which is small in
comparison. My point is that we should be able to build and
package a small set of "common" configurations: all, SAX
only, DOM only, etc. Obviously we can't handle building and
packaging every kind of user configuration in the world so
let's just hit the sweet spot and let the others handle
their own build and packaging details.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org


Re: [Xerces2] Packages (LONG)

Posted by Ted Leung <tw...@sauria.com>.
----- Original Message -----
From: "Andy Clark" <an...@apache.org>
To: <xe...@xml.apache.org>
Sent: Tuesday, March 13, 2001 7:15 PM
Subject: Re: [Xerces2] Packages (LONG)


> Ted Leung wrote:
> > as much what we do as that we do something.  Minimally, I'd like to
> > see all the "user-level" apis -- DOM, SAX, HTML DOM, WML DOM, and
> > Serializers partitioned into separate packages.   Whether or not we
> > deliver them in a
>
> I can see everything separated except SAX really because our
> API relies on it. And don't forget to include JAXP as well
> because if you throw that in as required, then you pull in
> DOM as well.
>

Definitely JAXP is a user-level API.  SAX and the code for the SAX parser
still take up space.  Some JDOM people might just want the low-level
engine and use an XNI JDOMbuilder to get what they want.  Also,
if we ever do a JSR-31 data binder, we might have a configuration that
never exposes SAX or DOM.  For a large number of people, this would
be the ideal configuration.

Ted


Re: [Xerces2] Packages (LONG)

Posted by Andy Clark <an...@apache.org>.
Ted Leung wrote:
> as much what we do as that we do something.  Minimally, I'd like to see
> all the "user-level" apis -- DOM, SAX, HTML DOM, WML DOM, and Serializers
> partitioned into separate packages.   Whether or not we deliver them in a

I can see everything separated except SAX really because our
API relies on it. And don't forget to include JAXP as well
because if you throw that in as required, then you pull in
DOM as well.

> The easiest being, that every piece of code that moves from X1 to X2 has
> a committer who is willing to do the maintenance on that piece.  No
> maintainer, no appearance in X2.

I agree.

> Oh, and before you go scaring people by putting (LONG) in the subject,
> please make sure that your message really is.   This one was nothing
> compared to the one you got Arnaud for ;-)

Okay. :)

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org
