Posted to j-users@xerces.apache.org by Massimo Valla <ma...@gmail.com> on 2006/02/05 03:54:25 UTC

How to read multiple XML from socket: cannot change the protocol

Hi,

I am trying to read multiple XML files from a socket using JAXP 1.3 /
Xerces-J 2.7.1.


Unfortunately the KeepSocketOpen example in Xerces2 Socket Sample (
http://xerces.apache.org/xerces2-j/samples-socket.html) does not work for
me, because I have no control over the other side of the socket.



Also FAQ-11 of Xerces1 (
http://xerces.apache.org/xerces-j/faq-write.html#faq-11) does not help
anymore, because the StreamingCharFactory class used there to prevent
buffering cannot be used in Xerces2 (cannot compile the class).



I have been trying to find a solution to this for a while now, but I have
not been able to find one.



Can anybody provide a simple example on how to read multiple XML docs from a
socket InputStream?



Thanks a lot,

Massimo

Re: How to read multiple XML from socket: cannot change the protocol

Posted by Massimo Valla <ma...@gmail.com>.
Hi Michael.
Thank you for your reply. I definitely agree with your point. The protocol is
awful, but unfortunately I can change neither the server side nor the protocol.
I could assume that each document ends when the root tag is closed. So
your example could be parsed and received as two documents:

1st doc:
   <?xml version="1.0" encoding="UTF-8"?>
   <root/>
2nd doc:
   <?xml version="1.0" encoding="UTF-8"?>
   <root2/>
leaving out the comment as not belonging to either of the two docs.

The problem is that by the time Xerces delivers the SAX end-tag notification
for the first root element, the parser has already buffered part of the next
XML message, so starting another parse on the same InputStream will not work.

How can I set up a solution similar to FAQ-11 (of Xerces1) in Xerces2?
More generally, how can I write a client with Xerces that is able to parse
multiple XML documents coming from the socket?
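
A minimal sketch of one possible workaround, assuming (as suggested above)
that each document ends when its root element closes: split the byte stream
into documents before any parser sees it, reading one byte at a time so that
nothing past the current root's end tag is consumed, then hand each chunk to
a separate parse. The class name XmlDocumentSplitter and its nextDocument()
method are hypothetical and not part of Xerces or JAXP. The sketch further
assumes an ASCII-compatible encoding such as UTF-8 and no DOCTYPE internal
subset, and it drops top-level comments as belonging to neither document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: cuts a stream of concatenated XML documents into
// one byte[] per document, never reading past the end of the current one.
class XmlDocumentSplitter {
    private final InputStream in;

    XmlDocumentSplitter(InputStream in) { this.in = in; }

    /** Returns the bytes of the next complete document, or null at end of stream. */
    byte[] nextDocument() throws IOException {
        StringBuilder doc = new StringBuilder(); // one char per byte (ISO-8859-1 mapping)
        int depth = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b != '<') {
                // Character data; anything before a document's first markup is skipped.
                if (doc.length() > 0) doc.append((char) b);
                continue;
            }
            String markup = readMarkup();
            if (markup == null) return null;               // stream ended inside markup
            if (depth == 0 && markup.startsWith("<!--")) continue; // top-level comment: dropped
            doc.append(markup);
            char kind = markup.charAt(1);
            if (kind == '?' || kind == '!') continue;      // PI, XML decl, CDATA, DOCTYPE, comment
            if (kind == '/') depth--;                      // end tag
            else if (!markup.endsWith("/>")) depth++;      // start tag (not an empty element)
            if (depth == 0) return doc.toString().getBytes(StandardCharsets.ISO_8859_1);
        }
        return null; // an incomplete trailing document is discarded
    }

    // Reads one markup construct; the leading '<' has already been consumed.
    private String readMarkup() throws IOException {
        StringBuilder m = new StringBuilder("<");
        String end = ">";  // terminator for tags and DOCTYPE
        int open = 1;      // length of the opening delimiter
        int quote = 0;     // active quote character inside a tag, or 0
        int b;
        while ((b = in.read()) != -1) {
            m.append((char) b);
            int n = m.length();
            if (n == 2 && b == '?') { end = "?>"; open = 2; }
            else if (n == 4 && m.toString().equals("<!--")) { end = "-->"; open = 4; }
            else if (n == 9 && m.toString().equals("<![CDATA[")) { end = "]]>"; open = 9; }
            if (end.equals(">")) {
                // A '>' inside a quoted attribute value does not close the tag.
                if (quote != 0) { if (b == quote) quote = 0; continue; }
                if (b == '"' || b == '\'') { quote = b; continue; }
            }
            if (n >= open + end.length() && m.substring(n - end.length()).equals(end)) {
                return m.toString();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for socket.getInputStream(): two documents with a comment between them.
        String wire = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root/>\n"
                + "<!-- comment -->\n"
                + "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root2/>\n";
        InputStream socketIn = new ByteArrayInputStream(wire.getBytes(StandardCharsets.UTF_8));
        XmlDocumentSplitter splitter = new XmlDocumentSplitter(socketIn);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        byte[] bytes;
        while ((bytes = splitter.nextDocument()) != null) {
            Document d = builder.parse(new ByteArrayInputStream(bytes));
            System.out.println(d.getDocumentElement().getNodeName()); // prints "root", then "root2"
        }
    }
}
```

Because each parse runs on its own in-memory byte array, any buffering the
parser does cannot swallow the start of the next message. Reading a raw
socket stream one byte at a time is slow, so wrapping it in a
BufferedInputStream before passing it to the splitter is probably worthwhile;
that should be safe as long as the splitter is the only reader of the stream.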

(I have also tried other parsers: they allow char-by-char parsing and do not
close the InputStream after a parse error, so I would be fine using them. But
I would very much prefer to stay with Xerces, as it is the parser used in
Java 1.5...)

Thanks a lot,
Massimo

On 2/12/06, Michael Glavassevich <mr...@ca.ibm.com> wrote:
> Hi Massimo,
>
> The KeepSocketOpen sample works because the server socket tells the client
> how many bytes there are in the document. If the server has no protocol
> for communicating the boundaries between XML documents, how can you tell
> where one begins and another ends?
>
> Consider if your client receives this from the socket:
>
> <root/>
> <!-- comment -->
> <root2/>
>
> How would you know whether you've received two documents or one not
> well-formed document containing multiple root elements? And if this is
> processed as two documents does the comment belong to the first or the
> second? Only the sender could know that.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
> Massimo Valla <massimo.valla@gmail.com> wrote on 02/04/2006 09:54:25 PM:
>
> > Hi,
> > I am trying to read multiple XML files from a socket using JAXP 1.3
> > / Xerces-J 2.7.1.
> >
> > Unfortunately the KeepSocketOpen example in Xerces2 Socket Sample (
> > http://xerces.apache.org/xerces2-j/samples-socket.html) does not
> > work for me, because I have no control over the other side of the
> > socket.
> >
> > Also FAQ-11 of Xerces1 (
> > http://xerces.apache.org/xerces-j/faq-write.html#faq-11) does not help
> > anymore, because the StreamingCharFactory class used there to prevent
> > buffering cannot be used in Xerces2 (cannot compile the class).
> >
> > I have been trying to find a solution to this for a while now, but I
> > have not been able to find one.
> >
> > Can anybody provide a simple example on how to read multiple XML
> > docs from a socket InputStream?
> >
> > Thanks a lot,
> > Massimo
>