Posted to general@xml.apache.org by Andy Clark <an...@apache.org> on 2000/12/04 05:37:51 UTC

[Sample] Parsing XML Documents on Socket

We have had another posting asking how to read XML documents on
a socket connection. Instead of just answering the question, I've
gone one better and written a new sample that shows you how
to read XML documents from a socket connection!

The solution to reading an XML document on a socket is to wrap
the input and output streams with a protocol. This lets the parser
parse a document from the socket stream, detect the end of the
document and (even more important) avoid closing the socket
connection when the input stream is closed.

My sample code works regardless of the encoding of the XML
document and is general enough to handle any variable-length
data being transferred on a socket. This is basically how it
works: the server "wraps" the output stream when sending the
XML document and the client "unwraps" the input stream when
receiving the XML document. I'll detail exactly how this works
below, but what's important is that it works transparently to
the server and client as long as they use the appropriate
"wrapper" classes for the input/output streams.

The wrapper input/output streams introduce a "packet" kind
of protocol onto the stream. When the server writes data to
the output stream, the wrapper class breaks that data into a
series of packets. Each packet consists of a simple header
that just states how many bytes are in the packet (not
including the header), followed by the packet data. The
receiving input stream knows how to read the header and
returns only the bytes in the packet data to the calling
client code.
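
To make the packet idea concrete, here is a minimal sketch of a
length-prefixed pair of streams. It is not the code from the sample:
the class names (FramedOutputStream, FramedInputStream), the 4-byte
big-endian length header and the zero-length packet used as an end
marker are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PacketFramingSketch {

    /** Writes every chunk as [4-byte length][data]. Hypothetical wire format. */
    static class FramedOutputStream extends OutputStream {
        private final DataOutputStream out;
        private boolean ended;

        FramedOutputStream(OutputStream out) { this.out = new DataOutputStream(out); }

        @Override public void write(int b) throws IOException {
            write(new byte[] { (byte) b }, 0, 1);
        }

        @Override public void write(byte[] buf, int off, int len) throws IOException {
            if (len <= 0) return;          // length 0 is reserved for the end marker
            out.writeInt(len);             // packet header
            out.write(buf, off, len);      // packet data
        }

        @Override public void flush() throws IOException { out.flush(); }

        /** Writes the end marker but leaves the underlying stream open. */
        @Override public void close() throws IOException {
            if (ended) return;
            ended = true;
            out.writeInt(0);
            out.flush();
        }
    }

    /** Returns only packet data and reports end of stream at the end marker. */
    static class FramedInputStream extends InputStream {
        private final DataInputStream in;
        private int remaining;             // unread bytes in the current packet
        private boolean ended;

        FramedInputStream(InputStream in) { this.in = new DataInputStream(in); }

        @Override public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xFF);
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (ended) return -1;
            if (remaining == 0) {
                remaining = in.readInt();  // next packet header
                if (remaining < 0) throw new IOException("negative packet length");
                if (remaining == 0) { ended = true; return -1; }
            }
            int n = in.read(buf, off, Math.min(len, remaining));
            if (n == -1) throw new EOFException("stream ended inside a packet");
            remaining -= n;
            return n;
        }

        /** Skips to the end marker but leaves the underlying stream open. */
        @Override public void close() throws IOException {
            byte[] skip = new byte[1024];
            while (read(skip, 0, skip.length) != -1) { /* discard */ }
        }
    }

    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8];
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) buf.write(chunk, 0, n);
        return buf.toString("UTF-8");
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the socket so the sketch runs on its own.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        for (String doc : new String[] { "<a>first</a>", "<b>second</b>" }) {
            OutputStream out = new FramedOutputStream(wire);
            out.write(doc.getBytes("UTF-8"));
            out.close();                   // ends this document, not the "socket"
        }
        InputStream socketIn = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readAll(new FramedInputStream(socketIn)));
        System.out.println(readAll(new FramedInputStream(socketIn)));
    }
}
```

Each framed input stream reports end of stream at its own end
marker, so two documents can be read back to back from the same
underlying stream without closing it.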

These input/output classes provide a general mechanism for
sending variable-length data on a socket connection. They
act as if there were a localized input/output stream within
the socket stream, and they can be used independently of
the socket sample. There are a few caveats, though: 1) the
server code MUST close the wrapper output stream; 2) the
client code MUST close the wrapper input stream. The second
requirement only matters if you detect a parse error and
need to skip to the end of the wrapper input stream in
order to continue processing the next piece of information.
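
The second caveat can be illustrated with a small sketch. It uses a
simplified framing (a single length header per document rather than
a series of packets), and the names BoundedInputStream and parseNext
are hypothetical, not part of the sample. The point is only that
close() on the wrapper drains the rest of the current document
instead of closing the underlying stream, so the next document is
still readable after a parse error.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SkipOnErrorSketch {

    /** Exposes exactly 'length' bytes of the underlying stream. */
    static class BoundedInputStream extends InputStream {
        private final InputStream in;
        private int remaining;

        BoundedInputStream(InputStream in, int length) {
            this.in = in;
            this.remaining = length;
        }

        @Override public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xFF);
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = in.read(buf, off, Math.min(len, remaining));
            if (n == -1) { remaining = 0; return -1; }
            remaining -= n;
            return n;
        }

        /** Drains the unread part of this document; never closes 'in'. */
        @Override public void close() throws IOException {
            byte[] skip = new byte[256];
            while (read(skip, 0, skip.length) != -1) { /* discard */ }
        }
    }

    /** Parses the next document; returns false if it is not well-formed. */
    static boolean parseNext(DataInputStream socketIn) throws Exception {
        InputStream doc = new BoundedInputStream(socketIn, socketIn.readInt());
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(doc, new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;                  // parse error: fall through to close()
        } finally {
            doc.close();                   // skip to the end of this document
        }
    }

    public static void main(String[] args) throws Exception {
        // A byte array stands in for the socket so the sketch runs on its own.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        for (String doc : new String[] { "<a><b></a> trailing junk", "<ok/>" }) {
            byte[] bytes = doc.getBytes("UTF-8");
            out.writeInt(bytes.length);
            out.write(bytes);
        }
        DataInputStream socketIn =
            new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(parseNext(socketIn));   // first document is malformed
        System.out.println(parseNext(socketIn));   // second document still parses
    }
}
```

Closing in a finally block keeps the underlying stream positioned
at the start of the next document whether or not the parse
succeeded.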

I added the sample to the Xerces2 codebase on the
assumption that we'll be moving that code over soon and it
can find a permanent home there. If you would like to check
it out now, here's how you do it (this will only check out
the socket samples dir from CVS):

  set CVSROOT=:pserver:anoncvs@xml.apache.org:/home/cvspublic
  cvs login        (password: anoncvs)
  cvs checkout -d socket -r xerces_j_2 xml-xerces/java/samples/socket

This will create a directory called "socket" which contains
the sample "socket.KeepSocketOpen". You can read the javadoc
for information about how to use this sample. Or, you can
just use the wrapper streams independently. They're in the
"socket.io" package and are called "WrappedOutputStream"
and "WrappedInputStream", respectively.

We'll have to put some explanation in the actual Xerces
documentation so that people know about this sample and
how to solve the XML-on-a-socket problem, though. Any
volunteers?

Let me know if this sample helps.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org

Re: [Sample] Parsing XML Documents on Socket

Posted by Son To <so...@gateway.homeip.net>.
On Sun, 3 Dec 2000, Andy Clark wrote:

> We have another posting about how to read XML documents on a
> socket connection. Instead of answering the question, I've
> done one better by writing a new sample that shows you how
> to read XML documents from a socket connection!
> 
> The solution to reading an XML document on a socket is to wrap 
> the input and output with a protocol. This enables the parser 
> to parse a document on the socket stream and detect the end of 
> the document and (even more important) not close the socket 
> connection by closing the input stream.

This does not make sense to me. I'm trying to use XML as a transport
protocol. Why can't I just open a socket and push XML to the daemon?
The daemon would get an InputStream from the client socket, wrap it in
an InputSource and give it to XMLReader.parse(InputSource). This works,
but the client has to close its output stream before
XMLReader.parse(InputSource) will start parsing the XML document.


> 
> My sample code works regardless of the encoding of the XML
> document and is general enough that it can be used to handle
> any variable length data being transferred on a socket. This
> is basically how it works: the server "wraps" the output 
> stream when sending the XML document and the client "unwraps" 
> the input stream when receiving the XML document. I'll detail 
> exactly how this works below but what's important is that it 
> works transparently to the server and client as long as they 
> use the appropriate "wrapper" classes for the input/output 
> streams.
> 
> The wrapper input/output streams introduce a "packet" kind 
> of protocol onto the stream. Therefore, when the server
> writes data to the output stream, the wrapper class breaks
> the input into a series of packets. These packets contain
> a simple header that just states how many bytes are in the
> packet (not including the header), followed by the packet
> data. The receiving input stream knows how to read the
> header and return only the bytes in the packet data to the
> calling client code.
> 
> These input/output classes provide a general mechanism for
> sending variable length data on a socket connection. It
> acts as if there is a localized input/output stream within
> the socket stream. And the wrapper classes can be used
> independently from the socket sample. There are a few
> caveats, though: 1) the server code MUST close the wrapper
> output stream; 2) the client code MUST close the wrapper
> input stream. The second requirement is only needed if you
> detect a parse error and need to skip to the end of the
> wrapper input stream to continue processing the next
> piece of information.
> 
> I added the sample to the Xerces2 codebase with the
> assumption that we'll be moving that code over soon and it
> can find a permanent home. If you would like to check it
> out now, here's how you do it (this will only checkout the
> socket samples dir from CVS):
> 
>   set CVSROOT=:pserver:anoncvs@xml.apache.org:/home/cvspublic
>   cvs login        (password: anoncvs)
>   cvs checkout -d socket -r xerces_j_2 xml-xerces/java/samples/socket
> 
> This will create a directory called "socket" which contains
> the sample "socket.KeepSocketOpen". You can read the javadoc
> for information about how to use this sample. Or, you can
> just use the wrapper streams independently. They're in the
> "socket.io" package and are called "WrappedOutputStream"
> and "WrappedInputStream", respectively.
> 
> We'll have to put some explanation in the actual Xerces
> documentation so that people know about this sample and
> how to solve the XML-on-a-socket problem, though. Any
> volunteers?
> 
> Let me know if this sample helps.
> 
> 

