Posted to c-dev@xerces.apache.org by Kevin Chong <ke...@fractaltechnologies.com> on 2002/07/31 07:47:49 UTC

Writing lots of elements to file

Hi,
 
I wish to write to a file an element which contains many (as many as
1,000,000) child elements. Using DOM, one would build all these
DOM_Elements before passing to a writer. However, this option is not
viable because it'll take up too much memory.
 
My initial thought was to derive my own element class that overrides the
behaviour of methods like getFirstChild() and getNextSibling() such that
child elements are constructed only on demand. This would solve my
problem since in my writer method, I'm doing something like:
 
            DOM_Node child = toWrite.getFirstChild(); // first child gets constructed here
            while( child!=0 ) {
                writeToFile( child, someFileStream );
                child = child.getNextSibling(); // next child sibling gets constructed here
                // and somehow ensure that the child that had already been
                // written to file is freed
            }
 
However, looking more closely at the implementation of xerces' DOM, none
of the methods of DOM_XXX classes are virtual, and what I first thought
would solve my problem would not work at all.
 
Does anyone have any suggestion that can help solve this problem? I'm
also interested to hear if people know of other DOM implementations that
would allow what I proposed to work.
 
Thanks in advance,
Kevin
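
[Editor's note] The on-demand idea in the message above can be sketched
without reference to Xerces. The following is only a minimal illustration
under stated assumptions: `Child`, `LazyChildren`, `writeAll` and
`renderToString` are hypothetical names invented here, not Xerces classes or
functions. It shows a writer loop in which each child is constructed when
requested and destroyed before the next one is built.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a child element; not a Xerces class.
struct Child {
    std::string name;
    std::string text;
};

// Produces children one at a time instead of storing them all.
class LazyChildren {
public:
    explicit LazyChildren(std::size_t count) : count_(count), next_(0) {}

    // Constructs the next child on demand; returns null when exhausted.
    std::unique_ptr<Child> next() {
        if (next_ >= count_) return nullptr;
        std::unique_ptr<Child> c(new Child{"item", std::to_string(next_)});
        ++next_;
        return c;
    }

private:
    std::size_t count_;
    std::size_t next_;
};

// Writes <parent> and its children; returns the number of children written.
std::size_t writeAll(std::ostream& out, const std::string& parent,
                     LazyChildren& children) {
    assert(!parent.empty());
    std::size_t written = 0;
    out << '<' << parent << '>';
    // `c` is destroyed at the end of every iteration, before the next child
    // is constructed, so only one child is alive at a time.
    while (std::unique_ptr<Child> c = children.next()) {
        out << '<' << c->name << '>' << c->text << "</" << c->name << '>';
        ++written;
    }
    out << "</" << parent << '>';
    return written;
}

// Convenience wrapper for small inputs; a real writer would pass a file stream.
std::string renderToString(const std::string& parent, std::size_t count) {
    std::ostringstream os;
    LazyChildren children(count);
    writeAll(os, parent, children);
    return os.str();
}
```

With a file stream in place of the string stream, memory use in this sketch
does not grow with the number of children.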

RE: Writing lots of elements to file

Posted by Gareth Reakes <ga...@decisionsoft.com>.
Hi,

On Wed, 31 Jul 2002, Kevin Chong wrote:

> > The methods are virtual already. Look at NodeImpl. In 2.0 the memory
> > management has changed so you don't have the reference counted
> > wrappers.
> 
> Good. That sounds like what I'm after.

Just to clarify that, the methods are virtual in 1.7. 


> 
> > OK. I would derive a class from ElementNSImpl and override the methods
> > as you suggest. You will also have to override the element creation
> > stuff in DOMParser. In addition to this you will have to take a look at
> > DOM_Document (and DocumentImpl) and derive from that so you can have a
> > doc.createMySpecialElement(). I don't think I'm forgetting anything (I'm
> > sure someone will tell me if I am). That should then work.
> 
> Is ElementNSImpl public? If it isn't, then I'd need to introduce my
> derived class inside the xerces library. Like I mentioned before, I'd
> prefer not to make any modification to the library, but if that's the
> only way, then I've got no other choice.


No, it is not public. However, you don't need to introduce your code into
the library; you just have to include a file that is not in the standard
include directory. This will mean that you need the source, but you will
not have to make modifications to it.


Gareth


> 
> Thanks again.

-- 
Gareth Reakes, Head of Product Development  
DecisionSoft Ltd.            http://www.decisionsoft.com
Office: +44 (0) 1865 203192



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


RE: Writing lots of elements to file

Posted by Kevin Chong <ke...@fractaltechnologies.com>.
> The methods are virtual already. Look at NodeImpl. In 2.0 the memory
> management has changed so you don't have the reference counted
> wrappers.

Good. That sounds like what I'm after.

> OK. I would derive a class from ElementNSImpl and override the methods
> as you suggest. You will also have to override the element creation
> stuff in DOMParser. In addition to this you will have to take a look at
> DOM_Document (and DocumentImpl) and derive from that so you can have a
> doc.createMySpecialElement(). I don't think I'm forgetting anything (I'm
> sure someone will tell me if I am). That should then work.

Is ElementNSImpl public? If it isn't, then I'd need to introduce my
derived class inside the xerces library. Like I mentioned before, I'd
prefer not to make any modification to the library, but if that's the
only way, then I've got no other choice.

Thanks again.





RE: Writing lots of elements to file

Posted by Gareth Reakes <ga...@decisionsoft.com>.
On Wed, 31 Jul 2002, Kevin Chong wrote:

> Thanks for the reply,

No problem.

> 
> I haven't checked v2.0 out yet, are the methods now virtual?

The methods are virtual already. Look at NodeImpl. In 2.0 the memory 
management has changed so you don't have the reference counted wrappers.
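
[Editor's note] The relationship described here, a non-virtual wrapper that
forwards to an implementation class with virtual methods, can be illustrated
roughly as follows. This is a sketch under assumptions, not the Xerces code:
`NodeBody`, `OnDemandBody` and `NodeHandle` are hypothetical names, and
`std::shared_ptr` stands in for whatever reference counting the 1.x wrappers
actually use.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Implementation ("body") class: this is where the virtual methods live.
class NodeBody {
public:
    virtual ~NodeBody() {}
    virtual std::string firstChildName() const { return "stored-child"; }
};

// A derived body changes behaviour by overriding the virtual method.
class OnDemandBody : public NodeBody {
public:
    std::string firstChildName() const override { return "generated-child"; }
};

// Wrapper ("handle"): copyable, non-virtual, shares ownership of the body.
// Deriving from the handle would not change what callers see, because its
// methods are not virtual; the behaviour comes from the body it points to.
class NodeHandle {
public:
    explicit NodeHandle(std::shared_ptr<NodeBody> body)
        : body_(std::move(body)) {}

    // Non-virtual forwarding call; dispatch happens inside the body.
    std::string firstChildName() const {
        assert(body_);
        return body_->firstChildName();
    }

    // Number of handles currently sharing this body.
    long useCount() const { return body_.use_count(); }

private:
    std::shared_ptr<NodeBody> body_;
};
```

In this arrangement the customisation point is the body class, which is why
the advice above points at NodeImpl rather than at the DOM_XXX wrappers.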

> 
> I did think about modifying the behaviour of ElementImpl, but that would
> require me to tweak xerces' source. I was hoping to be able to use
> xerces straight out of the box and make any modifications only in my
> code. In addition, that won't give me the ability to choose at run time
> which behaviour I want. We are already using DOM in our code and we
> don't want to change the behaviour in those places.

OK. I would derive a class from ElementNSImpl and override the methods as 
you suggest. You will also have to override the element creation stuff in 
DOMParser. In addition to this you will have to take a look at 
DOM_Document (and DocumentImpl) and derive from that so you can have a 
doc.createMySpecialElement(). I don't think I'm forgetting anything (I'm 
sure someone will tell me if I am). That should then work.
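
[Editor's note] The shape of this suggestion (derive an element class,
override its virtual methods, and derive a document class that creates the
new element type) might look roughly like the sketch below. All class names
(`ElementBase`, `LazyElement`, `DocumentBase`, `LazyDocument`) and the
`build` helper are hypothetical stand-ins; the real ElementNSImpl,
DocumentImpl and DOMParser signatures are not shown in this thread and are
not reproduced here.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for an implementation-level element class with virtual methods.
class ElementBase {
public:
    explicit ElementBase(const std::string& name) : name_(name) {}
    virtual ~ElementBase() {}
    virtual std::string describe() const { return "eager:" + name_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Derived element that overrides the virtual behaviour.
class LazyElement : public ElementBase {
public:
    explicit LazyElement(const std::string& name) : ElementBase(name) {}
    std::string describe() const override { return "lazy:" + name(); }
};

// Stand-in for the document class, which acts as the element factory.
class DocumentBase {
public:
    virtual ~DocumentBase() {}
    virtual std::unique_ptr<ElementBase> createElement(
        const std::string& name) const {
        return std::unique_ptr<ElementBase>(new ElementBase(name));
    }
};

// Derived document whose factory method creates the special element type.
class LazyDocument : public DocumentBase {
public:
    std::unique_ptr<ElementBase> createElement(
        const std::string& name) const override {
        return std::unique_ptr<ElementBase>(new LazyElement(name));
    }
};

// Code written against the base classes; the element behaviour depends on
// which document object it is handed at run time.
std::string build(const DocumentBase& doc, const std::string& name) {
    assert(!name.empty());
    return doc.createElement(name)->describe();
}
```

Because the choice is made by the document object passed in, existing code
that uses the default document keeps its current behaviour.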

> 
> There are a couple of reasons why I'm intending to use DOM. One is that
> we already have a writer for DOM. Another is that in our application we
> might not be able to build the XML representation in sequential
> order.
> 
> Kevin.



RE: Writing lots of elements to file

Posted by Kevin Chong <ke...@fractaltechnologies.com>.
Thanks for the reply,

I haven't checked v2.0 out yet, are the methods now virtual?

I did think about modifying the behaviour of ElementImpl, but that would
require me to tweak xerces' source. I was hoping to be able to use
xerces straight out of the box and make any modifications only in my
code. In addition, that won't give me the ability to choose at run time
which behaviour I want. We are already using DOM in our code and we
don't want to change the behaviour in those places.

There are a couple of reasons why I'm intending to use DOM. One is that
we already have a writer for DOM. Another is that in our application we
might not be able to build the XML representation in sequential
order.

Kevin.
 




Re: Writing lots of elements to file

Posted by Gareth Reakes <ga...@decisionsoft.com>.
Hi,
	from the look of your code you are using 1.7. The way things work 
has changed considerably for 2.0, so you may want to base your work on 
that so you can benefit from future upgrades. 
	The DOM_XXX classes are just reference-counted wrappers around the 
impls. You want to be looking at NodeImpl, ElementImpl etc. You could do 
what you suggest, but then you would have to make the parser create 
your type of element. You might just be able to replace the code in 
ElementImpl and recompile.
	Do you have to use DOM? Why not use SAX? It sounds like you should 
be.


Gareth
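
[Editor's note] The SAX suggestion amounts to handling the document as a
stream of events rather than as a tree. Applied to writing, a minimal sketch
of that style is shown below. `StreamingXmlWriter` and `writeRows` are
hypothetical names made up for this illustration and are not part of the
Xerces API; the sketch also assumes the output can be produced in document
order.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Event-style writer: each call emits text immediately, so no tree is built.
class StreamingXmlWriter {
public:
    explicit StreamingXmlWriter(std::ostream& out) : out_(out) {}

    void startElement(const std::string& name) {
        out_ << '<' << name << '>';
        open_.push_back(name);  // remember the name for the matching end tag
    }

    // Writes character data, escaping the characters that are special in XML text.
    void characters(const std::string& text) {
        for (char ch : text) {
            switch (ch) {
                case '&': out_ << "&amp;"; break;
                case '<': out_ << "&lt;"; break;
                case '>': out_ << "&gt;"; break;
                default:  out_ << ch;
            }
        }
    }

    void endElement() {
        assert(!open_.empty());
        out_ << "</" << open_.back() << '>';
        open_.pop_back();
    }

    // Number of elements currently open.
    std::size_t depth() const { return open_.size(); }

private:
    std::ostream& out_;
    std::vector<std::string> open_;  // only the open ancestors are kept
};

// Writes `count` <row> children under <table>. A string stream is used here
// for demonstration; with a file stream nothing accumulates in memory.
std::string writeRows(std::size_t count) {
    std::ostringstream os;
    StreamingXmlWriter w(os);
    w.startElement("table");
    for (std::size_t i = 0; i < count; ++i) {
        w.startElement("row");
        w.characters(std::to_string(i));
        w.endElement();
    }
    w.endElement();
    return os.str();
}
```

The only state the writer keeps is the stack of currently open element
names, so its memory use depends on nesting depth, not on the number of
children.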



