Posted to general@xerces.apache.org by Paul Prescod <pa...@prescod.net> on 2000/01/14 00:33:54 UTC

Validator

Will it, in future versions, be possible to use Xerces' XML Schema
validator as a separate component? My problem is that I have multiple
validators competing to be my "parser" and as I recall, XML was designed
to allow a clean split.

I understand how the current situation came about because earlier
versions of the Schema specification were not nicely layerable. I was
one of several who fought hard to reverse that situation and I wonder if
the Xerces team has a plan to take advantage of this change by turning
the validator into just another SAX filter.
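As a rough illustration of the "validator as just another SAX filter"
idea, here is a minimal sketch built on the SAX2 XMLFilterImpl helper
class. The AllowedElementsFilter class, its allow-list check, and the
check() helper are hypothetical and are not part of Xerces; a real
validator would enforce a DTD or Schema rather than a fixed set of names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter, not a Xerces class. It stands in for a validator
// by checking element names against a fixed allow-list.
public class AllowedElementsFilter extends XMLFilterImpl {
    private final Set<String> allowed;
    private final List<String> problems = new ArrayList<>();

    public AllowedElementsFilter(XMLReader parent, Set<String> allowed) {
        super(parent);
        this.allowed = allowed;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Record the problem, then pass the event on unchanged so that
        // downstream handlers still see the full document.
        if (!allowed.contains(qName)) {
            problems.add("unexpected element: " + qName);
        }
        super.startElement(uri, localName, qName, atts);
    }

    public List<String> getProblems() {
        return problems;
    }

    // Convenience wrapper: parse a string through the filter and return
    // whatever problems were recorded.
    public static List<String> check(String xml, String... allowedNames)
            throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        AllowedElementsFilter filter = new AllowedElementsFilter(
            reader, new HashSet<>(Arrays.asList(allowedNames)));
        filter.parse(new InputSource(new StringReader(xml)));
        return filter.getProblems();
    }
}
```

Because the filter is itself an XMLReader, it can be placed at any point
in a chain of SAX components, which is the layering being asked for.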

Those that do the work make the decisions, but please consider this a
feature request from a single user.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
                                               --Faith Whittlesey

Re: Validator

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
Ted Leung wrote:
> 
> Hi,
> Let me inject some additional input from a Java perspective:
> [...]

Let me add another requirement... As I went on developing StyleBook and
Cocoon, I felt the need for a validator as a standalone piece.
When doing XSLT translation, we basically change the DTD of the
document, and then, to double-check what's coming out of XSLT, I would
need to have a validator for a specific DTD/Schema.
The RevalidatingDOM stuff in the old XML4J, then, is not sufficient,
because it re-validates the DOM tree against the same DTD.
I'm asking, is it possible to have a validator that reads a DTD/Schema
and validates a DOM/SAX?
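For illustration only: the javax.xml.validation package, which was added
to the JDK in Java 5 (well after this thread), offers roughly this shape
of API. The sketch below validates an already-built DOM tree against a
schema supplied separately from the document. The DomAgainstSchema class
and its isValid() helper are hypothetical names, and this is not
something Xerces provided at the time of writing.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, shown only to illustrate the idea.
public class DomAgainstSchema {
    public static boolean isValid(String xml, String xsd) throws Exception {
        // Build the DOM without any validation; it could equally be the
        // output of an XSLT transformation.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Load the schema independently of the document.
        Schema schema = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(xsd)));

        try {
            // With no ErrorHandler set, validity errors are thrown.
            schema.newValidator().validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```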

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>
--------------------------------------------------------------------
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------

Re: Validator

Posted by Ted Leung <tw...@sauria.com>.
Hi,

Let me inject some additional input from a Java perspective:

At the moment, the Java validation architecture and the C++ validation
architecture are a little bit out of sync, but not in a huge way.  The
Java validator also talks directly to the scanner, and application-level
APIs are presented with their data by the validator.  This allows us to
use one validator to provide validation for SAX and DOM.  We are
discussing modifications to the Java validator that would make it a
"regular" event handler, which would correspond roughly to the notion of
document handler in the old Java implementation and the current C++
implementation.  My opinion is that there is a bunch of stuff which is up
in the air, and which we should be considering / re-considering.  I also
think that this mailing list is the right place for this discussion.

There are a couple of issues being tossed around in this thread.

1.  Is it possible to make the validator a standalone thing outside
of the parser?  This includes the question of revalidation.  Paul and
Scott are interested in this for different reasons.

Paul wants to plug a validator in at various points in a SAX-connected
process.  I'd like to understand why he wants to do this.  If we proceed
with making the validator a regular document handler, we are part of the
way to what I think Paul wants.  Today the Java validator still relies on
pools which live elsewhere, outside the validator.  One consequence of
this is that it is hard to make a DTD/Schema cache.  By the time we build
that cache, we should have an "object" which will have all the info it
needs (predigested) to process a Schema or DTD.  To get what I think Paul
wants then involves some API on the validator that translates between
pool ids and strings.
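To make the pool-id translation concrete, here is a minimal sketch of a
symbol pool that maps strings to small integer ids and back.  The
SymbolPool class and its intern() and lookup() methods are hypothetical
names chosen for illustration; this is not the actual Xerces pool API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pool, not the Xerces implementation.
public class SymbolPool {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    // Returns the id for a string, assigning the next free id on first use.
    public int intern(String s) {
        Integer id = ids.get(s);
        if (id == null) {
            id = strings.size();
            strings.add(s);
            ids.put(s, id);
        }
        return id;
    }

    // Translates an id back to its string for callers that work with
    // plain strings, such as SAX handlers.
    public String lookup(int id) {
        return strings.get(id);
    }
}
```

A validator sitting behind a plain SAX interface would need something
like intern() on the way in and lookup() on the way out.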

Scott is interested in being able to validate a DOM tree without writing
it out and re-reading it.  In the 2.0.x version of XML4J we had a
RevalidatingDOMParser which would allow you to change a DOM tree and call
the validator to make sure that the changed tree was still valid.  We
haven't ported this functionality forward to Xerces because we want to do
it in a way that also supports Schema.  This is a goal for us.

2. What is the plan for accessing information in a type-safe way (i.e.
how do I get the value of a DOM node as a float?).   At the moment, the
schema efforts are focusing on validity checking -- does a piece of text
follow the structural rules set out by Schema?   We know that we are
going to have to present an application with typed data.  This will
involve the creation of another API which talks to the internals of the
validator.  The type-aware versions of SAX and DOM will then talk to this
API to get typed information.   At the moment, our thinking is to
integrate the type-safe API with the validation code, since the code that
converts data from strings to floats will have to do similar work to the
code that checks that a string conforms to float syntax.
I do see the value of having the access API not depend on the validator,
but it seems to me that the information which the access API needs to do
its job is a decent fraction of the information that a validator needs to
construct.
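The overlap between checking and converting can be sketched as follows.
The FloatDatatype class is hypothetical and is not a Xerces API, and the
lexical grammar it accepts (decimal digits with an optional exponent,
plus INF, -INF and NaN) is a simplification assumed for illustration.

```java
import java.util.Optional;

// Hypothetical helper, not a Xerces API.
public class FloatDatatype {
    // Simplified lexical form: optional sign, digits with an optional
    // fraction, optional exponent.  This is an assumption for the sketch.
    private static final String LEXICAL =
        "[+-]?(\\d+(\\.\\d*)?|\\.\\d+)([eE][+-]?\\d+)?";

    // Parses the lexical form once.  An empty result means the text is
    // not a valid float under the grammar above.
    public static Optional<Float> parse(String lexical) {
        String s = lexical.trim();
        switch (s) {
            case "INF":  return Optional.of(Float.POSITIVE_INFINITY);
            case "-INF": return Optional.of(Float.NEGATIVE_INFINITY);
            case "NaN":  return Optional.of(Float.NaN);
            default:     break;
        }
        // Float.parseFloat alone is too lenient (it accepts "1.5f", for
        // example), so the syntax is checked before converting.
        if (!s.matches(LEXICAL)) {
            return Optional.empty();
        }
        return Optional.of(Float.parseFloat(s));
    }

    // Validation is the same work with the converted value discarded.
    public static boolean isValid(String lexical) {
        return parse(lexical).isPresent();
    }
}
```

Here isValid() is a thin wrapper over parse(), which is the kind of
shared work the paragraph above describes.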

Ted