Posted to c-dev@xerces.apache.org by Dan White <yg...@comcast.net> on 2004/09/26 07:09:11 UTC

HowTo Request: Sax Events to DOM Tree

Is there a reasonably simple way to build a DOM Tree with SAX events ?

I am working on a client-server application that communicates by XML over
a socket. Some of the other developers are having trouble pulling a
complete XML statement off of the data stream.

I implemented a DOM parser with a very large input buffer and have had
no problems (yet).

It was suggested that a SAX Parser would not have size limitations or 
the problem of finding the end of the statement, but I have already 
written a lot of code to process a DOM tree and would like to be able 
to create one from the SAX Parser.

Thanks in advance.


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: HowTo Request: Sax Events to DOM Tree

Posted by Simon Kitching <si...@ecnetwork.co.nz>.
On Fri, 2004-10-01 at 09:04, Dan White wrote:
> The idea is to take advantage of SAX and its ability to process a
> stream of non-specific length and combine that with the data organization
> of DOM.

Are you saying that you need to handle an input stream containing
multiple XML documents? I needed to do this a couple of years ago and
asked about doing this in Xerces-J. The answer was basically "no, and we
aren't interested in handling this". There was a valid point made that
processing instructions can occur even after the end of the document
root element, though in practice this would be extremely rare.

As far as I am aware, you can either (a) handle startElement/endElement
SAX events, keeping track of the element depth, and when it returns to
zero somehow fudge the parser's input stream to return EOF, or (b) have
some external protocol that allows you to detect boundaries between XML
documents in the input stream (e.g. insist each is packaged as a MIME
part). I think (a) will work, though if the parser does "read-ahead"
this could cause problems.
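
Option (b) above can be sketched as a simple length-prefix framing layer.
This is only a minimal sketch under assumptions not made anywhere in this
thread: each XML document is assumed to be sent as a 4-byte big-endian
length followed by the document bytes, and FrameDecoder and encodeFrame are
hypothetical names, not part of Xerces. It shows how complete documents
could be reassembled from socket reads of arbitrary size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed wire format (not from this thread): a 4-byte big-endian length,
// followed by that many bytes of XML.
constexpr std::size_t kHeaderSize = 4;

// Sender side: prepend the length header to one XML document.
std::string encodeFrame(const std::string& xml) {
    assert(xml.size() <= 0xFFFFFFFFu);  // length must fit in the header
    const std::uint32_t n = static_cast<std::uint32_t>(xml.size());
    std::string frame;
    for (int shift = 24; shift >= 0; shift -= 8) {
        frame.push_back(static_cast<char>((n >> shift) & 0xFF));
    }
    frame += xml;
    return frame;
}

// Receiver side (hypothetical helper): buffers whatever each socket read
// returns and hands back only complete documents.
class FrameDecoder {
public:
    // Append the bytes from the latest read; they may end mid-message.
    void feed(const char* data, std::size_t len) { buffer_.append(data, len); }

    // Move every complete document currently buffered into 'out'.
    // Returns the number of documents extracted by this call.
    std::size_t drain(std::vector<std::string>& out) {
        std::size_t pos = 0;
        std::size_t count = 0;
        while (buffer_.size() - pos >= kHeaderSize) {
            std::uint32_t bodyLen = 0;
            for (std::size_t i = 0; i < kHeaderSize; ++i) {
                bodyLen = (bodyLen << 8) |
                          static_cast<unsigned char>(buffer_[pos + i]);
            }
            if (buffer_.size() - pos - kHeaderSize < bodyLen) {
                break;  // body not fully received yet
            }
            out.push_back(buffer_.substr(pos + kHeaderSize, bodyLen));
            pos += kHeaderSize + bodyLen;
            ++count;
        }
        buffer_.erase(0, pos);  // keep only the unfinished tail
        return count;
    }

private:
    std::string buffer_;
};
```

Each string appended to 'out' is one complete document, which could then be
parsed from memory (in Xerces-C, for instance, through a
MemBufInputSource). The buffer grows with the message, so no fixed maximum
size has to be chosen up front.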

Regarding the ability to "parse input of any length": as others have
said, if you are determined to build a DOM model of the input in memory,
then you *must* have enough RAM to hold that model. As has also been
mentioned, the parse methods that build a DOM don't read the whole
buffer into memory before starting to parse; the parsing phase works
exactly the same as SAX parsing.

The alternative is *not* to build a DOM model at all, but to handle the
SAX events and perform whatever processing you want immediately. But
that isn't what you asked about...

Note also that the DOM model of an XML document is many times larger
than the original file. So if you don't have enough RAM to read the
original file into memory (though this doesn't happen anyway), you won't
have enough memory to hold the DOM model of that document.

Regards,

Simon




Re: HowTo Request: Sax Events to DOM Tree

Posted by da...@us.ibm.com.
> The idea is to take advantage of SAX and its ability to
> process a stream of non-specific length and combine that with
> the data organization of DOM.

I'm not sure I understand what you mean by this.  Since the same 
underlying scanner technology runs the SAX parsers and the DOM parsers and 
builders, they have the same requirements vis-a-vis the input stream.

> So, I would like to be able to parse an input of any length
> --AND-- have the result end up in a DOM tree for easy manipulation.
> I would like to do this without having to pick apart half of the
> Xerces library, if possible.

You can only do this if your machine has an infinite amount of memory,
which is clearly impossible.  So, I still don't understand why the
DOMBuilder won't work for you and why you think SAX is the cure.

Dave



Re: HowTo Request: Sax Events to DOM Tree

Posted by Dan White <yg...@comcast.net>.
The idea is to take advantage of SAX and its ability to process a
stream of non-specific length and combine that with the data organization
of DOM.

So, I would like to be able to parse an input of any length --AND-- have
the result end up in a DOM tree for easy manipulation.  I would like to do
this without having to pick apart half of the Xerces library, if possible.

On Thu, 30 Sep 2004, Alberto Massari wrote:

> At 11.37 30/09/2004 -0400, Dan White wrote:
> >OK, in reading data from a socket, you either do a "read_n" with a big
> >enough buffer to hold any message you might get or you do repeated smaller
> >reads until you arrive at some agreed-upon separation of message traffic.
> >
> >I am currently using a DOM input parser/validator with the first method
> >(big buffer), but it was pointed out to me that this could cause trouble
> >if the message size outgrows the buffer.
> >
> >Contrariwise, SAX processes the stream flow and can handle any size
> >statement.
>
> In Xerces both DOM and SAX parsers use the same XMLReader/XMLScanner
> objects; so, both of them should either work or not work in your
> environment. It's up to the InputSource you are using to properly merge the
> socket traffic. In my opinion, switching to SAX will not help you.
>
> Alberto
>
>
> >My dilemma is that I have written a bunch of code that starts from a DOM
> >tree in memory.  If I switch the input/validation to SAX, my choices are
> >to either rewrite all the rest of the code to accommodate the SAX "process
> >it as you parse it" way of processing or create a DOM tree from the SAX
> >events and then proceed as before.
> >
> >I am looking for info to implement the second choice.
> >
> >On Thu, 30 Sep 2004, Alberto Massari wrote:
> >
> > > At 08.03 30/09/2004 -0400, Dan White wrote:
> > > >Bump.
> > > >
> > > >Anyone have a clue on this ?
> > >
> > > I haven't quite understood what your problem is: what is the connection
> > > between "pulling a complete XML statement off the data stream", "a DOM
> > > parser with a large input buffer" and "building a DOM out of a SAX parser"?
> > >
> > > Alberto
> > >
> > >
> > > >On Sep 26, 2004, at 1:09 AM, Dan White wrote:
> > > >
> > > >>Is there a reasonably simple way to build a DOM Tree with SAX events ?
> > > >>
> > > >>I am working on a client-server that communicates by XML over a
> > > >>socket.  Some of the other developers are having trouble pulling a
> > > >>complete XML statement off of the data stream.
> > > >>
> > > >>I implemented a DOM Parser with a very large input buffer and have had no
> > > >>problems (Yet)
> > > >>
> > > >>It was suggested that a SAX Parser would not have size limitations or the
> > > >>problem of finding the end of the statement, but I have already written a
> > > >>lot of code to process a DOM tree and would like to be able to create one
> > > >>from the SAX Parser.
> > > >>
> > > >>Thanks in advance.
> > > >>
> > > >>
> > > >>---------------------------------------------------------------------
> > > >>To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
> > > >>For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
> > > >
> > > >
> > > >---------------------------------------------------------------------
> > > >To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
> > > >For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
> > > For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
> > >
> > >
> > >
>
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: HowTo Request: Sax Events to DOM Tree

Posted by Alberto Massari <am...@progress.com>.
At 11.37 30/09/2004 -0400, Dan White wrote:
>OK, in reading data from a socket, you either do a "read_n" with a big
>enough buffer to hold any message you might get or you do repeated smaller
>reads until you arrive at some agreed-upon separation of message traffic.
>
>I am currently using a DOM input parser/validator with the first method
>(big buffer), but it was pointed out to me that this could cause trouble
>if the message size outgrows the buffer.
>
>Contrariwise, SAX processes the stream flow and can handle any size
>statement.

In Xerces both DOM and SAX parsers use the same XMLReader/XMLScanner 
objects; so, both of them should either work or not work in your 
environment. It's up to the InputSource you are using to properly merge the 
socket traffic. In my opinion, switching to SAX will not help you.

Alberto
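
To illustrate what "properly merging the socket traffic" might look like,
here is a minimal sketch of a byte stream that hides chunk boundaries from
its reader. It is loosely modelled on the readBytes/curPos shape of
Xerces-C's BinInputStream but is deliberately standalone: ChunkedByteStream
and ChunkSource are hypothetical names, and the callback is an assumed
stand-in for a blocking socket read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a socket read: returns the next chunk of bytes,
// or an empty string once the peer has closed the connection.
using ChunkSource = std::function<std::string()>;

// Hypothetical adapter, not a Xerces class. The reader asks for bytes and
// never sees where one socket read ended and the next began.
class ChunkedByteStream {
public:
    explicit ChunkedByteStream(ChunkSource source)
        : source_(std::move(source)) {}

    // Copy up to maxToRead bytes into toFill. Returns the number of bytes
    // copied; 0 means end of input.
    std::size_t readBytes(char* toFill, std::size_t maxToRead) {
        assert(toFill != nullptr && maxToRead > 0);
        // Fetch a new chunk only when the previous one is used up.
        if (offset_ == pending_.size() && !eof_) {
            pending_ = source_();
            offset_ = 0;
            if (pending_.empty()) {
                eof_ = true;
            }
        }
        const std::size_t n = std::min(maxToRead, pending_.size() - offset_);
        std::memcpy(toFill, pending_.data() + offset_, n);
        offset_ += n;
        consumed_ += n;
        return n;
    }

    // Total number of bytes handed to the reader so far.
    std::size_t curPos() const { return consumed_; }

private:
    ChunkSource source_;
    std::string pending_;
    std::size_t offset_ = 0;
    std::size_t consumed_ = 0;
    bool eof_ = false;
};
```

In a real program the same logic would presumably live in a class derived
from BinInputStream and be returned from a custom InputSource's
makeStream(); the exact signatures differ between Xerces-C versions, so
check the headers of the release in use. Note that this only works when the
end of the document coincides with the end of the stream, which is exactly
the limitation discussed elsewhere in this thread.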


>My dilemma is that I have written a bunch of code that starts from a DOM
>tree in memory.  If I switch the input/validation to SAX, my choices are
>to either rewrite all the rest of the code to accommodate the SAX "process
>it as you parse it" way of processing or create a DOM tree from the SAX
>events and then proceed as before.
>
>I am looking for info to implement the second choice.
>
>On Thu, 30 Sep 2004, Alberto Massari wrote:
>
> > At 08.03 30/09/2004 -0400, Dan White wrote:
> > >Bump.
> > >
> > >Anyone have a clue on this ?
> >
> > I haven't quite understood what your problem is: what is the connection
> > between "pulling a complete XML statement off the data stream", "a DOM
> > parser with a large input buffer" and "building a DOM out of a SAX parser"?
> >
> > Alberto
> >
> >
> > >On Sep 26, 2004, at 1:09 AM, Dan White wrote:
> > >
> > >>Is there a reasonably simple way to build a DOM Tree with SAX events ?
> > >>
> > >>I am working on a client-server that communicates by XML over a
> > >>socket.  Some of the other developers are having trouble pulling a
> > >>complete XML statement off of the data stream.
> > >>
> > >>I implemented a DOM Parser with a very large input buffer and have had no
> > >>problems (Yet)
> > >>
> > >>It was suggested that a SAX Parser would not have size limitations or the
> > >>problem of finding the end of the statement, but I have already written a
> > >>lot of code to process a DOM tree and would like to be able to create one
> > >>from the SAX Parser.
> > >>
> > >>Thanks in advance.


Re: HowTo Request: Sax Events to DOM Tree

Posted by Alberto Massari <am...@progress.com>.
At 08.03 30/09/2004 -0400, Dan White wrote:
>Bump.
>
>Anyone have a clue on this ?

I haven't quite understood what your problem is: what is the connection
between "pulling a complete XML statement off the data stream", "a DOM
parser with a large input buffer" and "building a DOM out of a SAX parser"?

Alberto


>On Sep 26, 2004, at 1:09 AM, Dan White wrote:
>
>>Is there a reasonably simple way to build a DOM Tree with SAX events ?
>>
>>I am working on a client-server that communicates by XML over a 
>>socket.  Some of the other developers are having trouble pulling a 
>>complete XML statement off of the data stream.
>>
>>I implemented a DOM Parser with a very large input buffer and have had no 
>>problems (Yet)
>>
>>It was suggested that a SAX Parser would not have size limitations or the 
>>problem of finding the end of the statement, but I have already written a 
>>lot of code to process a DOM tree and would like to be able to create one 
>>from the SAX Parser.
>>
>>Thanks in advance.


Re: HowTo Request: Sax Events to DOM Tree

Posted by Gareth Reakes <ga...@parthenoncomputing.com>.
Hi,

I have a DOM-to-SAX converter, but not a SAX-to-DOM one. It should not be
too hard to make something using AbstractDOMParser, though I would have to
take a closer look to be sure.

Gareth
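
The general technique of building a tree from SAX-style events can be
sketched with a stack of open elements. This is not Xerces code and does
not use AbstractDOMParser: Node and TreeBuilder are hypothetical stand-ins,
and the three methods only mirror the SAX callbacks of the same names
(attributes and namespaces are left out for brevity).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node. A real port would create element
// and text nodes through the DOM document's factory methods instead.
struct Node {
    std::string name;  // element name; empty for text nodes
    std::string text;  // character data; empty for elements
    std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical handler whose methods mirror the SAX callbacks.
class TreeBuilder {
public:
    // Create an element, attach it to the innermost open element (or make
    // it the root), and push it onto the stack of open elements.
    void startElement(const std::string& name) {
        std::unique_ptr<Node> node(new Node);
        node->name = name;
        Node* raw = node.get();
        if (open_.empty()) {
            assert(!root_ && "a document has exactly one root element");
            root_ = std::move(node);
        } else {
            open_.back()->children.push_back(std::move(node));
        }
        open_.push_back(raw);
    }

    // Attach character data as a text child of the innermost open element.
    void characters(const std::string& chars) {
        if (open_.empty()) {
            return;  // ignore anything outside the root element
        }
        std::unique_ptr<Node> node(new Node);
        node->text = chars;
        open_.back()->children.push_back(std::move(node));
    }

    // Close the innermost open element.
    void endElement(const std::string& name) {
        assert(!open_.empty() && open_.back()->name == name);
        open_.pop_back();
    }

    // True once the root element has been opened and closed again,
    // i.e. the element depth has returned to zero.
    bool complete() const { return root_ && open_.empty(); }

    const Node* root() const { return root_.get(); }

private:
    std::unique_ptr<Node> root_;
    std::vector<Node*> open_;  // non-owning pointers into the tree
};
```

complete() becomes true when the element depth returns to zero, which is
also the point at which one whole message has been received.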

Dan White wrote:

> Bump.
>
> Anyone have a clue on this ?
>
> On Sep 26, 2004, at 1:09 AM, Dan White wrote:
>
>> Is there a reasonably simple way to build a DOM Tree with SAX events ?
>>
>> I am working on a client-server that communicates by XML over a 
>> socket.  Some of the other developers are having trouble pulling a 
>> complete XML statement off of the data stream.
>>
>> I implemented a DOM Parser with a very large input buffer and have 
>> had no problems (Yet)
>>
>> It was suggested that a SAX Parser would not have size limitations or 
>> the problem of finding the end of the statement, but I have already 
>> written a lot of code to process a DOM tree and would like to be able 
>> to create one from the SAX Parser.
>>
>> Thanks in advance.
-- 
Gareth Reakes, Managing Director      Parthenon Computing
+44-1865-811184                  http://www.parthcomp.com



Re: HowTo Request: Sax Events to DOM Tree

Posted by Dan White <yg...@comcast.net>.
Bump.

Anyone have a clue on this ?

On Sep 26, 2004, at 1:09 AM, Dan White wrote:

> Is there a reasonably simple way to build a DOM Tree with SAX events ?
>
> I am working on a client-server that communicates by XML over a 
> socket.  Some of the other developers are having trouble pulling a 
> complete XML statement off of the data stream.
>
> I implemented a DOM Parser with a very large input buffer and have had 
> no problems (Yet)
>
> It was suggested that a SAX Parser would not have size limitations or 
> the problem of finding the end of the statement, but I have already 
> written a lot of code to process a DOM tree and would like to be able 
> to create one from the SAX Parser.
>
> Thanks in advance.