Posted to j-dev@xerces.apache.org by Aleksander Slominski <as...@cs.indiana.edu> on 2001/08/15 22:12:18 UTC

help on implementing pull in X2 [was Re: [Xerces2] Pull Parsing]

Ted Leung wrote:

> Well, let's talk about how this would work -- You'd have
> to return an object that had, say:
>
> public interface PullParserAPI {
>     public void setInputSource(InputSource source);
>     public boolean parseDocument(boolean complete);
> }
>
> That way you'd define a parser configuration, which would have no
> parser API methods.  The usage model then would be to create
> a parser convenience class that implemented PullParserAPI, which
> then used the pull configuration to get the object from the property and
> hook it up as the implementation of the parser convenience class.

hi,

could somebody point me to where in the documentation or source code to look
for how to implement a pull parser configuration? i would like to play with it
to see how incremental parsing happens (and how it fits into the pipeline) -
for example, how is the next event obtained, and what needs to be written?

thanks,

alek



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org

Re: help on implementing pull in X2 [was Re: [Xerces2] Pull Parsing]

Posted by Ted Leung <tw...@sauria.com>.

----- Original Message -----
From: "Andy Clark" <an...@apache.org>
To: <xe...@xml.apache.org>
Sent: Friday, August 17, 2001 3:12 AM
Subject: Re: help on implementing pull in X2 [was Re: [Xerces2] Pull
Parsing]


> After looking at the pull-parsing capability in the document
> scanner some more, I realized that it didn't quite work right.
> So I fixed some things and also separated out the scanning of
> the DTD so that when we resolve the issue I raised in a
> recent post, we will be able to support pull-parsing even
> out through the DTD scanner. Neat! :)
>
> But there's still at least one more bug that I still see in
> the pull-parsing document scanner code so I'll try to fix
> that soon. I'll post a response when I've squashed it. But
> the code in CVS now shouldn't cause any regressions as long
> as you use it in the default push mode.
>
> [Q] Are we all in agreement about changing the parse method
>     in the XMLParserConfiguration? If so, I'll make that
>     change over the weekend as well.
+1
> --
> Andy Clark * IBM, TRL - Japan * andyc@apache.org



Re: help on implementing pull in X2 [was Re: [Xerces2] Pull Parsing]

Posted by Andy Clark <an...@apache.org>.

After looking at the pull-parsing capability in the document
scanner some more, I realized that it didn't quite work right.
So I fixed some things and also separated out the scanning of
the DTD so that when we resolve the issue I raised in a
recent post, we will be able to support pull-parsing even
out through the DTD scanner. Neat! :)

But there's still at least one more bug that I still see in
the pull-parsing document scanner code so I'll try to fix 
that soon. I'll post a response when I've squashed it. But
the code in CVS now shouldn't cause any regressions as long
as you use it in the default push mode.

[Q] Are we all in agreement about changing the parse method
    in the XMLParserConfiguration? If so, I'll make that
    change over the weekend as well.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org


Re: help on implementing pull in X2 [was Re: [Xerces2] Pull Parsing]

Posted by Andy Clark <an...@apache.org>.

Andy Clark wrote:
> document scanner passes the "complete" parameter to the DTD
> scanner. However, this is a problem because the code will not
> correctly handle pull parsing the DTD (internal or external
> subset) in the case where complete is false. So that code
> will have to be fixed. 

Currently, the scanner can be used as a pull parser but the
DTD's internal and external subsets will be read completely.
I'm working on a fix to allow the DTD declarations to
also be pull parsed from the document scanner.

However, it occurs to me that you may want to have separate 
"complete" values for document vs. DTD scanning when you are 
scanning a document. In other words, a pull parser may want 
to stop after each document event *but* read the DTD entirely
without stopping. OR... a pull parser may want to stop after 
each document *and* DTD event.

This all leads me to think that we might need a change to
the XMLDocumentScanner interface. For example:

  - scanDocument(boolean complete):boolean
  + scanDocument(boolean completeDoc, boolean completeDTD):boolean

But, again, this would then cascade up through the config
and the parser instances...

The other option is that the scanning of the DTD's internal
and external subset takes the same "complete" value as the
scanning of the document. And a feature could be used to set 
the "completeDTD" value.
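
To make the two options concrete, here is a minimal sketch in Java. It
is not the real XMLDocumentScanner: TwoFlagScanner, setCompleteDTD and
the string "events" are hypothetical stand-ins, and the return value is
assumed to mean "true if there is more to scan". The two-argument method
models the first option; the one-argument method models the second,
where the DTD follows the document's "complete" value unless a feature
overrides it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy scanner; NOT the real XMLDocumentScanner interface. */
class TwoFlagScanner {
    private final List<String> dtdEvents;
    private final List<String> docEvents;
    private int dtdPos = 0;
    private int docPos = 0;

    /** Assumed feature value; null means "follow the document flag". */
    private Boolean completeDTDFeature = null;

    /** Everything handed to the (imaginary) handler so far. */
    final List<String> delivered = new ArrayList<>();

    TwoFlagScanner(List<String> dtdEvents, List<String> docEvents) {
        this.dtdEvents = dtdEvents;
        this.docEvents = docEvents;
    }

    private boolean hasMore() {
        return dtdPos < dtdEvents.size() || docPos < docEvents.size();
    }

    /** Option 1: separate flags. Returns true if there is more to scan. */
    boolean scanDocument(boolean completeDoc, boolean completeDTD) {
        while (dtdPos < dtdEvents.size()) {
            delivered.add(dtdEvents.get(dtdPos++));
            if (!completeDTD) {
                return hasMore();   // stop after a single DTD event
            }
        }
        while (docPos < docEvents.size()) {
            delivered.add(docEvents.get(docPos++));
            if (!completeDoc) {
                return hasMore();   // stop after a single document event
            }
        }
        return false;
    }

    /** Option 2: a feature supplies the DTD flag. */
    void setCompleteDTD(boolean value) {
        completeDTDFeature = value;
    }

    /** Option 2: one flag; the DTD uses the same value unless the feature is set. */
    boolean scanDocument(boolean complete) {
        boolean completeDTD =
            (completeDTDFeature != null) ? completeDTDFeature : complete;
        return scanDocument(complete, completeDTD);
    }
}
```

In this sketch, scanDocument(false, true) reads the whole DTD and then
stops after the first document event, while scanDocument(false, false)
stops after every event, DTD or document. The second option keeps the
scanDocument signature unchanged, so nothing has to cascade up through
the config and the parsers.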

Whatcha think?

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org


Re: help on implementing pull in X2 [was Re: [Xerces2] Pull Parsing]

Posted by Andy Clark <an...@apache.org>.

Aleksander Slominski wrote:
> could somebody point me to where in the documentation or source code to look
> for how to implement a pull parser configuration? i would like to play with it
> to see how incremental parsing happens (and how it fits into the pipeline) -
> for example, how is the next event obtained, and what needs to be written?

There's no documentation. However, if you look at the scanner
interfaces that are part of the xni.parser package, then you'll
notice that they each have setInputSource and scanXXX methods
that allow you to parse the document a piece at a time. The
granularity of the information is based on the methods in the
XNI handler interfaces. 

It would be the responsibility of the pull parser implementation 
to receive this information and communicate it in whatever 
fashion it wants. But the breakdown of the data is ultimately 
mandated by what information is available in XNI.

The Xerces2 reference implementation has document and DTD
scanners that are capable of doing pull parsing. However, there
is currently no direct API to call this through the parsers or
parser configurations. (That's the discussion we're having
right now with Ted.) So for now you'd have to write a custom
configuration (you can copy the StandardParserConfiguration)
and add methods to allow pull-parsing.
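
As a rough illustration of the idea, here is a minimal, self-contained
Java sketch. It does not use the real XNI classes: EventHandler,
ToyScanner and PullAdapter are hypothetical names, the "input source" is
just a list of strings, and scanDocument(false) is assumed to deliver
one event and return true if there is more to scan. The adapter
receives the pushed callbacks and hands them out one at a time through
next(), which is one way a pull layer could sit on top of a scanner.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical callback, standing in for an XNI handler interface. */
interface EventHandler {
    void event(String name);
}

/** Hypothetical scanner with a setInputSource/scanXXX shape. */
class ToyScanner {
    private Iterator<String> input = Collections.emptyIterator();
    private EventHandler handler;

    void setHandler(EventHandler handler) {
        this.handler = handler;
    }

    void setInputSource(List<String> events) {
        this.input = events.iterator();
    }

    /** Scans one event (or everything if complete); true if more remains. */
    boolean scanDocument(boolean complete) {
        while (input.hasNext()) {
            handler.event(input.next());
            if (!complete) {
                break;
            }
        }
        return input.hasNext();
    }
}

/** Turns the scanner's pushed callbacks into a pull-style next() call. */
class PullAdapter implements EventHandler {
    private final ToyScanner scanner;
    private final Deque<String> pending = new ArrayDeque<>();
    private boolean more = true;

    PullAdapter(ToyScanner scanner, List<String> source) {
        this.scanner = scanner;
        scanner.setHandler(this);
        scanner.setInputSource(source);
    }

    @Override
    public void event(String name) {
        pending.add(name);   // buffer whatever the scanner pushes
    }

    /** Returns the next event, or null once the input is exhausted. */
    String next() {
        if (pending.isEmpty() && more) {
            more = scanner.scanDocument(false);   // ask for one more piece
        }
        return pending.poll();
    }
}
```

A caller would construct new PullAdapter(new ToyScanner(), events) and
call next() until it returns null. In a real configuration the buffer
would hold whatever the XNI handler callbacks provide, so the
granularity is still dictated by XNI.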

But I just thought of something... Let me check... Okay, the
document scanner passes the "complete" parameter to the DTD
scanner. However, this is a problem because the code will not
correctly handle pull parsing the DTD (internal or external
subset) in the case where complete is false. So that code 
will have to be fixed. Looking for something to do? ;)

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org
