Posted to c-dev@xerces.apache.org by Bruce Bailey <bj...@plaza.ds.adp.com> on 2001/01/30 22:55:06 UTC

Problem parsing from stdin (more info)

Hi

I am still struggling with a SAX2 problem using stdin to read my XML
document.  I know quite a bit more now about what the problem is.

I have created a server program that accesses legacy data.  The server
receives an XML-formatted request, processes that request, and then returns
an XML-formatted reply.  In the program, I start the parser, using
'parser->parse(StdInInputSource());'.  So far so good.  To test it, I invoke
my server program from the UNIX command line, using file redirection for the
input and output, "server < request.xml > reply.xml".  This works just as I
hoped it would.

Next, I configured inetd to launch my server program.  This is a good way to
create a server since inetd automatically maps the socket connection to
stdin and stdout, amongst other things.  In theory, all I have to do is
connect the parser to stdin (already done) and send my reply to stdout.  The
server program processes a single document and terminates.  I then don't
have to worry about managing a long-running process, listening for a socket
connection, etc.  Simple.

There is a problem with this scheme.  At the lowest level, the UNIX versions
of the Xerces parser use fread.  Calls to fread only return when enough
bytes have been read to fill the supplied buffer or it reaches end-of-file.
Since the fread is really reading from a socket, there is no end-of-file.
Therefore, the fread simply hangs, waiting for a condition that will not
occur.  I presume that all UNIX versions will behave the same way.
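The difference between a buffered fread and a raw read() can be seen in a small sketch. It uses a pipe as a stand-in for the inetd socket (an assumption made only to keep the example self-contained), and the helper name bytesFromSingleRead is hypothetical. A single read() returns as soon as some data is available, whereas fread on the same descriptor would keep waiting for the rest of the requested count while the writer stays connected.

```cpp
#include <unistd.h>     // pipe, read, write, close
#include <sys/types.h>  // ssize_t
#include <cstddef>
#include <vector>

// Hypothetical helper: writes msgLen bytes into a fresh pipe, leaves the
// write end open (so no end-of-file is pending), and reports how many bytes
// ONE read() call delivers when asked for `want` bytes.
// Both msgLen and want must be greater than zero, otherwise read() would
// either block or trivially return 0.
long bytesFromSingleRead(const char* msg, std::size_t msgLen, std::size_t want) {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], msg, msgLen) != static_cast<ssize_t>(msgLen)) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    std::vector<char> buf(want);
    // read() hands back whatever is available instead of waiting for `want`.
    ssize_t n = read(fds[0], buf.data(), buf.size());
    close(fds[0]);
    close(fds[1]);
    return static_cast<long>(n);
}
```

Asking for 1024 bytes when only a 4-byte document has been written yields 4 from read(); a call such as fread(buf, 1, 1024, stream) on the same descriptor would not return until 1020 more bytes arrived or the writer closed its end.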

The only solutions that I can think of would require modification to
Tru64PlatformUtils.cpp -- something I really want to avoid.

Does anyone have any suggestions as to how I can make this scheme work,
preferably ones that don't involve changing the Xerces library?  Are there
reasonable alternatives?

Sorry about the long explanation, but I wanted to include enough information
so that my problem would be understood.

Thanks,

Bruce


> Bruce Bailey
> ADP Dealer Services Group
> Suite 450
> 2525 SW First Avenue
> Portland, OR 97201
> 
> 503 294-4206
> bjb@NospaMplaza.ds.adp.com
> 

Re: Problem parsing from stdin (more info)

Posted by Bob Kline <bk...@rksystems.com>.
On Tue, 30 Jan 2001, Bruce Bailey wrote:

> There is a problem with this scheme.  At the lowest level, the UNIX
> versions of the Xerces parser use fread.  Calls to fread only return
> when enough bytes have been read to fill the supplied buffer or it
> reaches end-of-file. Since the fread is really reading from a
> socket, there is no end-of-file. Therefore, the fread simply hangs,
> waiting for a condition that will not occur.  I presume that all
> UNIX versions will behave the same way.
> ....
> Does anyone have any suggestions as to how I can make this scheme
> work, preferably ones that don't involve changing the Xerces
> library?  Are there reasonable alternatives?

The solution we use is:

 1. Send 4-byte integer (network order) with number of bytes in doc.
 2. Send the document.
 3. Client reads the size prefix and allocates a buffer.
 4. Client reads the document into the buffer.
 5. Client creates a MemBufInputSource which it passes to the parser.

That way your fread knows exactly how many bytes to expect.  If such a
read hangs, you've got a bug somewhere.
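The five steps above can be sketched roughly as follows. This is a minimal illustration using plain POSIX read and write on a file descriptor, not the code we actually run; the function names (sendDocument, receiveDocument, readExact, writeAll) are made up for the example, and error handling is reduced to a bool.

```cpp
#include <arpa/inet.h>  // htonl, ntohl
#include <unistd.h>     // read, write
#include <sys/types.h>  // ssize_t
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read exactly len bytes, looping over short reads.
// Returns false if end-of-file or an error occurs first.
bool readExact(int fd, void* buf, std::size_t len) {
    char* p = static_cast<char*>(buf);
    while (len > 0) {
        ssize_t n = ::read(fd, p, len);
        if (n < 0 && errno == EINTR) continue;  // interrupted, retry
        if (n <= 0) return false;               // EOF or real error
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Write exactly len bytes, looping over short writes.
bool writeAll(int fd, const void* buf, std::size_t len) {
    const char* p = static_cast<const char*>(buf);
    while (len > 0) {
        ssize_t n = ::write(fd, p, len);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) return false;
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Steps 1 and 2: 4-byte length in network order, then the document.
bool sendDocument(int fd, const std::string& doc) {
    std::uint32_t prefix = htonl(static_cast<std::uint32_t>(doc.size()));
    return writeAll(fd, &prefix, sizeof prefix) &&
           writeAll(fd, doc.data(), doc.size());
}

// Steps 3 and 4: read the prefix, size the buffer, read the document.
bool receiveDocument(int fd, std::vector<char>& doc) {
    std::uint32_t prefix = 0;
    if (!readExact(fd, &prefix, sizeof prefix)) return false;
    doc.resize(ntohl(prefix));
    return doc.empty() || readExact(fd, doc.data(), doc.size());
}
```

In a server launched by inetd, the receiving side would call receiveDocument on descriptor 0 and then hand the filled buffer to a MemBufInputSource (step 5). A real implementation would probably also cap the accepted length before resizing the buffer.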

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com