Posted to j-users@xerces.apache.org by Michael Glavassevich <mr...@ca.ibm.com> on 2006/04/16 18:17:58 UTC
Re: best approach to whole document cloning in Xerces2?
Hi Jake,
The behaviour of Document.cloneNode(true) [1] is implementation dependent.
In Xerces it will create a new Document and then import the children from
the original document. I would be really surprised if reparsing the
document performed better than an in-memory copy (unless you had a
UserDataHandler [2] registered which does some heavy operation in response
to the cloning/importing).
[1]
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-3A0ED0A4
[2]
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#UserDataHandler
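
As a rough illustration of the in-memory approach described above, the sketch below parses a template once and hands out deep copies with cloneNode(true). The class name TemplateCache and its methods are hypothetical, and the parser is obtained through the standard JAXP DocumentBuilderFactory rather than the Xerces DOMParser directly. Whether the copy carries the doctype depends on the DOM implementation, as noted above. The copy method is synchronized on the assumption that the master document should not be read by several threads at once.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: keeps one parsed master copy and clones it on demand.
final class TemplateCache {
    private final Document master;

    // Parse the template text once, up front.
    TemplateCache(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            this.master = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse template", e);
        }
    }

    // Deep copy of the whole document; what exactly is copied
    // (doctype, entities) is implementation dependent.
    synchronized Document newCopy() {
        return (Document) master.cloneNode(true);
    }
}
```

Each request would call newCopy(), modify the returned Document and serialize it, leaving the master untouched.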
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Jacob Kjome <ho...@visi.com> wrote on 04/16/2006 02:17:10 AM:
>
> I'm wondering what's the best approach to cloning an entire
> document? Would it be better to keep a master copy in memory and
> then create a new document and import the other document in there, or
> would it be better to simply reparse the document every time (where
> the document is used over and over again as a template, a copy is
> created and manipulated on each HTTP request, then serialized to the
> browser)? If I keep the document in memory and know I am dealing
> with the Xerces2 implementation, can I just call cloneNode(true) and
> get an identical copy of the whole document, including doctype,
> entities, entity references, etc...? Again, would this be more
> efficient than reparsing the document each time with, say, the
> Xerces2 DOMParser? Is there a clear-cut answer to this, or does it
> depend on document size or other aspect of the document or environment?
>
> thanks,
>
> Jake
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
> For additional commands, e-mail: general-help@xml.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org
Re: best approach to whole document cloning in Xerces2?
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Jacob Kjome <ho...@visi.com> wrote on 04/17/2006 09:17:53 AM:
> At 11:17 AM 4/16/2006, you wrote:
> >Hi Jake,
> >
> >The behaviour of Document.cloneNode(true) [1] is implementation dependent.
> >In Xerces it will create a new Document and then import the children from
> >the original document.
>
> Which would leave out the DTD, I suppose.
I believe it does copy DocumentType nodes, though there's no guarantee
that other DOM implementations will do that.
> So, it would make more
> sense to create my own document and do something like this, right?...
>
>     DOMImplementation domImpl = document.getImplementation();
>     String documentElement = document.getDoctype().getName();
>     DocumentType docType = domImpl.createDocumentType(documentElement,
>         document.getDoctype().getPublicId(),
>         document.getDoctype().getSystemId());
>     Document doc = domImpl.createDocument("", documentElement, docType);
>     Node node = doc.importNode(document.getDocumentElement(), true);
>     doc.replaceChild(node, doc.getDocumentElement());
>
> This is what I do currently to get a copy of the template DOM at
> runtime, but I just want to make sure I'm doing it the most correct
> and efficient way possible.
>
> Of course, this leaves out any internal subset and entity nodes,
> no?
Right.
> How would I clone it all?
The implementation of Xerces' Document.cloneNode() should be able to do
that.
> Is it possible via the DOM interfaces?
You cannot import DocumentType nodes using the DOM API (
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Core-Document-importNode).
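
To make the limitation concrete, here is a hedged sketch of the manual copy discussed in this thread, using only standard DOM calls. The class name ManualCopySketch, its method names, and the "placeholder" element name are hypothetical. Because importNode cannot take a DocumentType, the doctype is recreated from its name, public ID and system ID only, so the internal subset and entity declarations are not carried over. Unlike the quoted snippet, this version also tolerates a source document without a doctype.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ManualCopySketch {
    // Parse XML text with the default JAXP parser (an assumption; the
    // thread itself uses Xerces2).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse", e);
        }
    }

    // Copy the document element into a new Document. The doctype is
    // recreated without its internal subset or entity declarations.
    static Document copyWithoutInternalSubset(Document src) {
        DOMImplementation impl = src.getImplementation();
        DocumentType srcType = src.getDoctype();
        DocumentType newType = null;
        if (srcType != null) {
            newType = impl.createDocumentType(
                srcType.getName(), srcType.getPublicId(), srcType.getSystemId());
        }
        // Temporary root element, replaced by the imported one below.
        Document copy = impl.createDocument(null, "placeholder", newType);
        Node imported = copy.importNode(src.getDocumentElement(), true);
        copy.replaceChild(imported, copy.getDocumentElement());
        return copy;
    }
}
```

If the internal subset or entity nodes matter, this approach is not enough; an implementation-specific Document.cloneNode(true) is the route suggested above.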
> > I would be really surprised if reparsing the
> >document performed better than an in-memory copy (unless you had a
> >UserDataHandler [2] registered which does some heavy operation in response
> >to the cloning/importing).
> >
>
> I kind of figured this, but I just wanted to make sure that the
> caching of template DOM's that I'm doing makes sense.
>
> Jake
>
> >[1]
> >http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-3A0ED0A4
> >[2]
> >http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#UserDataHandler
> >
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Re: best approach to whole document cloning in Xerces2?
Posted by Jacob Kjome <ho...@visi.com>.
At 11:17 AM 4/16/2006, you wrote:
>Hi Jake,
>
>The behaviour of Document.cloneNode(true) [1] is implementation dependent.
>In Xerces it will create a new Document and then import the children from
>the original document.
Which would leave out the DTD, I suppose. So, it would make more
sense to create my own document and do something like this, right?...
    DOMImplementation domImpl = document.getImplementation();
    String documentElement = document.getDoctype().getName();
    DocumentType docType = domImpl.createDocumentType(documentElement,
        document.getDoctype().getPublicId(),
        document.getDoctype().getSystemId());
    Document doc = domImpl.createDocument("", documentElement, docType);
    Node node = doc.importNode(document.getDocumentElement(), true);
    doc.replaceChild(node, doc.getDocumentElement());
This is what I do currently to get a copy of the template DOM at
runtime, but I just want to make sure I'm doing it the most correct
and efficient way possible.
Of course, this leaves out any internal subset and entity nodes,
no? How would I clone it all? Is it possible via the DOM interfaces?
> I would be really surprised if reparsing the
>document performed better than an in-memory copy (unless you had a
>UserDataHandler [2] registered which does some heavy operation in response
>to the cloning/importing).
>
I kind of figured this, but I just wanted to make sure that the
caching of template DOM's that I'm doing makes sense.
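
For reference, the UserDataHandler caveat quoted above can be illustrated with a small sketch. The class name CopyCountingHandler, the key "example.key" and the element name "page" are hypothetical. The handler only counts callbacks, but any expensive work placed in handle() would run for every node that carries user data each time the template is copied. Which operation code is reported during a whole-document clone (cloned or imported) is treated here as implementation dependent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.UserDataHandler;

// Hypothetical handler that counts how often a node carrying user data
// is cloned or imported.
final class CopyCountingHandler implements UserDataHandler {
    private int copies;

    @Override
    public void handle(short operation, String key, Object data,
                       Node src, Node dst) {
        // Heavy work here would slow down every clone/import.
        if (operation == NODE_CLONED || operation == NODE_IMPORTED) {
            copies++;
        }
    }

    int copies() {
        return copies;
    }

    // Build a one-element document whose root has user data attached
    // together with the given handler.
    static Document newDocumentWithHandler(CopyCountingHandler handler) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("page");
            doc.appendChild(root);
            root.setUserData("example.key", "example value", handler);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If no such handler is registered on the template, this cost does not arise.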
Jake
>[1]
>http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-3A0ED0A4
>[2]
>http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#UserDataHandler
>