Posted to j-dev@xerces.apache.org by Tim McCune <ti...@channelpoint.com> on 2001/01/16 08:23:59 UTC

SAX Buffering

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm trying to use the SAX parser to parse XML that's streaming in
over a socket.  However, no SAX events are firing.  The parsing
thread is hanging on ChunkyByteArray.java line 235, which is calling
read() on the socket's InputStream.  It appears that Xerces is
waiting until its buffer is full or until the end of the stream is
reached, before it fires any SAX events.  I need it to start firing
SAX events as soon as possible (e.g. when an element start tag is
completely read.)  Does anyone have any suggestions on how I might go
about this?  The simple brute force solution seemed to be to turn off
the buffering.  I set CHUNK_SHIFT to 0 in ChunkyByteArray and in
UTF8DataChunk, but that's giving me an ArrayIndexOutOfBoundsException
in ChunkyByteArray's read method.  Any suggestions or guidance would
be appreciated.  Thanks.
 
Tim McCune
Software Architect, ChannelPoint
"People prefer to fail conservatively than to risk succeeding
differently; prefer to invent than to research, and find it hard to
change their habits." -- Alistair Cockburn
 

-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.3 for non-commercial use <http://www.pgp.com>

iQA/AwUBOmP2ztUPOr8a7vy5EQK8pQCgwOM3ZGPNF09bqApUWsd7UIy5iJwAoKLY
UKNrvJwovOjAbNLuze7aRQ9D
=3Wl4
-----END PGP SIGNATURE-----

Re: SAX Buffering

Posted by Andy Clark <an...@apache.org>.
Tim McCune wrote:
> I'm trying to use the SAX parser to parse XML that's streaming in
> over a socket.  However, no SAX events are firing.  The parsing
> thread is hanging on ChunkyByteArray.java line 235, which is calling
> read() on the socket's InputStream.  It appears that Xerces is
> waiting until its buffer is full or until the end of the stream is

Your assessment is 100% correct. Unfortunately, it's a real pain to
hack Xerces 1.x so that it does not block while the socket stream is
still open. There *is* a solution and some people have done it --
I'm sure one of them can post their patch for you.

The good news is that Xerces2 does not have this problem. There is
also another solution to your problem that will work with Xerces 1.x
without modifying the parser code or creating a custom parser class.

There is a sample for Xerces2 called "socket.KeepSocketOpen" that
wraps the output of the XML document on the server using a special
output stream and uses a special un-wrapper on the client side.
Look at the sample code for an example of how to use these wrapper
stream classes.

The sample was written to solve another problem -- namely that
socket streams usually stay open but there's no unique marker to
signal the end of the document to the parser -- but it can also
be used to solve your problem.
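
To make the wrapper idea concrete, here is a minimal sketch of that kind
of stream pair. It is not the code from the socket.KeepSocketOpen sample:
the class names FramedOutputStream and FramedInputStream are hypothetical,
and the wire format (a 4-byte length before each packet, with a
zero-length packet marking the end of a document) is an assumption chosen
for illustration. The demo uses the JAXP SAX API and a byte array in
place of a real socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class FramedStreams {

    /** Hypothetical sender side: every write becomes a length-prefixed packet. */
    static class FramedOutputStream extends OutputStream {
        private final DataOutputStream data;
        FramedOutputStream(OutputStream out) { data = new DataOutputStream(out); }

        @Override public void write(int b) throws IOException {
            write(new byte[] {(byte) b}, 0, 1);
        }
        @Override public void write(byte[] b, int off, int len) throws IOException {
            if (len == 0) return;          // length 0 is reserved for "end of document"
            data.writeInt(len);
            data.write(b, off, len);
            data.flush();
        }
        /** Ends the document with a zero-length packet; the underlying stream stays open. */
        @Override public void close() throws IOException {
            data.writeInt(0);
            data.flush();
        }
    }

    /** Hypothetical receiver side: reports end-of-stream at the zero-length packet. */
    static class FramedInputStream extends InputStream {
        private final DataInputStream data;
        private int remaining;             // bytes left in the current packet
        private boolean ended;
        FramedInputStream(InputStream in) { data = new DataInputStream(in); }

        @Override public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (ended) return -1;
            if (remaining == 0) {
                remaining = data.readInt();
                if (remaining < 0) throw new IOException("negative packet length");
                if (remaining == 0) { ended = true; return -1; }
            }
            // Never read past the current packet.
            int n = data.read(b, off, Math.min(len, remaining));
            if (n == -1) throw new EOFException("stream ended inside a packet");
            remaining -= n;
            return n;
        }
        /** Deliberately a no-op so a parser closing its input cannot close the socket. */
        @Override public void close() { }
    }

    /** Parses one framed document from the shared stream; returns the element names seen. */
    static List<String> parseOne(InputStream shared) throws Exception {
        List<String> names = new ArrayList<>();
        InputStream framed = new FramedInputStream(shared);
        SAXParserFactory.newInstance().newSAXParser().parse(framed, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(qName);
            }
        });
        while (framed.read() != -1) { }    // make sure the end marker has been consumed
        return names;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        for (String doc : new String[] {"<a><b/></a>", "<c/>"}) {
            try (OutputStream out = new FramedOutputStream(wire)) {
                out.write(doc.getBytes(StandardCharsets.UTF_8));
            }
        }
        InputStream shared = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(parseOne(shared)); // first document
        System.out.println(parseOne(shared)); // second document, same open stream
    }
}
```

With framing like this the parser sees an ordinary end-of-stream at the
end of each document, so it finishes parsing and delivers its events
while the connection stays open for the next document. It does not by
itself make a parser that insists on filling its buffer fire events in
the middle of a document.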

The sample was included in the Xerces 2.0.0 (alpha) release so
you can pick it up from there. Or you can extract it from CVS.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org