Posted to j-dev@xerces.apache.org by Arnaud Le Hors <le...@us.ibm.com> on 2001/03/13 04:23:05 UTC

Processing Model, and DTDs vs Schemas

Hi again,

As I said in my previous message on Xerces2, there is no such thing as a
standard processing model. That said, some pieces are clearly implied
by the various XML specifications, and when we implement a parser such
as Xerces (both 1 and 2), we effectively implement a specific
processing model. In Xerces2 one may change it through some coding (at
least in theory ;-), while in Xerces1 it's built in.

In the processing model we've implemented so far, DTDs and Schemas are
mutually exclusive: we use either the DTD or the schema, but not both.
I think this is wrong. I believe the processing model expected by most
people (and certainly all the people of the Schema WG I've talked to)
is something like this:

doc scanning -> dtd validation -> namespace binding -> schema validation

Note that this does not rule out having several schema validation
phases, each one using a different schema and producing an infoset
that is further augmented.
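
A minimal sketch of the pipeline described above, assuming each phase
can be modelled as a stage that receives the infoset, augments it, and
passes it on. The names PipelineSketch, Infoset, Stage, stage and run
are hypothetical and are not Xerces classes; the stages here only
record that they ran, they do no real scanning or validation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class PipelineSketch {
    /** Hypothetical stand-in for the information passed between stages. */
    static final class Infoset {
        final List<String> annotations = new ArrayList<>();
    }

    /** A stage receives the infoset, augments it, and hands it on. */
    interface Stage extends UnaryOperator<Infoset> {}

    /** Builds a placeholder stage that only records its own name. */
    static Stage stage(String name) {
        return info -> {
            info.annotations.add(name);
            return info;
        };
    }

    /** Runs the stages in order over a fresh infoset. */
    static Infoset run(List<Stage> stages) {
        Infoset info = new Infoset();
        for (Stage s : stages) {
            info = s.apply(info);
        }
        return info;
    }

    public static void main(String[] args) {
        // Two schema validation phases, each assumed to use a different schema.
        List<Stage> pipeline = Arrays.asList(
            stage("doc scanning"),
            stage("dtd validation"),
            stage("namespace binding"),
            stage("schema validation (schema A)"),
            stage("schema validation (schema B)"));
        System.out.println(String.join(" -> ", run(pipeline).annotations));
    }
}
```

Because every stage has the same shape, adding a further schema
validation phase in this sketch is just one more entry in the list.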

What we currently have is something more like:

doc scanning -> validation/namespace binding

The namespace binding is handled by the validator and happens either
before or after the validation depending on the type of grammar.
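
For contrast, a rough sketch of that combined arrangement, where one
component decides the order from the grammar type. Which order goes
with which grammar is an assumption here, inferred from the proposed
pipeline above (DTD validation before namespace binding, schema
validation after it). CombinedValidatorSketch, GrammarType and steps
are hypothetical names, not Xerces code.

```java
import java.util.ArrayList;
import java.util.List;

public class CombinedValidatorSketch {
    enum GrammarType { DTD, SCHEMA }

    /** Returns the steps in the order a single combined component would run them. */
    static List<String> steps(GrammarType grammar) {
        List<String> order = new ArrayList<>();
        if (grammar == GrammarType.DTD) {
            // Assumed: DTD validation works on raw prefixed names, so binding comes after.
            order.add("validation");
            order.add("namespace binding");
        } else {
            // Assumed: schema validation needs bound namespaces, so binding comes first.
            order.add("namespace binding");
            order.add("validation");
        }
        return order;
    }

    public static void main(String[] args) {
        for (GrammarType g : GrammarType.values()) {
            System.out.println(g + ": doc scanning -> " + String.join(" -> ", steps(g)));
        }
    }
}
```

Since only one grammar type is chosen per run, this arrangement cannot
apply a DTD and a schema to the same document.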

As I said, I think we got the wrong model. People will most likely keep
using DTDs at least for entities and will start using Schemas in
addition to that to get a more powerful grammar. Am I just fantasizing
here? If not, how do we support this?

The major drawback of simply implementing the processing model I
described above is that the more components we add for the information
to go through, the slower the engine becomes. On the other hand, it's
more powerful too...

This problem is not Xerces2 specific, which is why I'm not tagging this
message as such. However, it may not be worth bothering with this for
Xerces1. For Xerces2, on the other hand, I would surely like our
components to be made so that I could implement such a pipeline (through
a ParserConfiguration object. :-)
-- 
Arnaud  Le Hors - IBM Cupertino, XML Strategy Group

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org

Re: Processing Model, and DTDs vs Schemas

Posted by Andy Clark <an...@apache.org>.

Arnaud Le Hors wrote:
> doc scanning -> dtd validation -> namespace binding -> schema validation

I agree that this is the way that we should do it. I'm half
tempted to start re-writing the validator today with the ideas
that I've been thinking about. And now that we have a parser 
configuration object, it's even easier to plug in a new
experimental component... :)

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org
