Posted to j-users@xerces.apache.org by Jacob Kjome <ho...@visi.com> on 2006/08/23 07:31:40 UTC

internal subset lost after using cloneNode

I think I'm seeing an issue in Xerces2 Java that was previously 
reported and fixed in XercesC++.

internal subset lost after using cloneNode
http://issues.apache.org/jira/browse/XERCESC-1170

Is this a known problem in Xerces2 Java?  The internal subset is
perfectly intact until I call Document.cloneNode(true).  When I
print the nodes, here is what the document type looks like before
the clone (expected) and after it (actual):

Expected....

         DocumentTypeImpl: name=document
          internalSubset=
    <!ENTITY erh "Elliotte Rusty Harold">
    <!ELEMENT document (title, signature)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT copyright (#PCDATA)>
    <!ELEMENT email (#PCDATA)>
    <!ELEMENT hr EMPTY>
    <!ELEMENT lastmodified (#PCDATA)>
    <!ELEMENT signature (hr, copyright, email, lastmodified)>

Actual....

         DocumentTypeImpl: name=document
             EntityImpl: name=erh
                 TextImpl: Elliotte Rusty Harold


As you can see, Document.cloneNode(true) seems to turn the <!ENTITY>
declaration in the internal subset into an actual Entity node, and the
rest of the internal subset (the <!ELEMENT> declarations) is discarded.
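
For anyone who wants to check their own parser version, here is a rough
sketch of a reproduction that uses only the standard JAXP and DOM Level 3
APIs. The class name, the helper names, and the document body are made up
for illustration; only the DTD declarations come from the output above.
Whether the clone keeps the internal subset depends on which parser
implementation JAXP picks up, so the program just prints what it finds.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical class name; uses whatever JAXP parser is on the classpath.
final class CloneInternalSubsetCheck {

    // The DTD declarations are the ones printed in the post. The document
    // body is a made-up placeholder (the default parser is non-validating,
    // so the body does not have to satisfy the content model).
    static final String XML =
          "<!DOCTYPE document [\n"
        + "    <!ENTITY erh \"Elliotte Rusty Harold\">\n"
        + "    <!ELEMENT document (title, signature)>\n"
        + "    <!ELEMENT title (#PCDATA)>\n"
        + "    <!ELEMENT copyright (#PCDATA)>\n"
        + "    <!ELEMENT email (#PCDATA)>\n"
        + "    <!ELEMENT hr EMPTY>\n"
        + "    <!ELEMENT lastmodified (#PCDATA)>\n"
        + "    <!ELEMENT signature (hr, copyright, email, lastmodified)>\n"
        + "]>\n"
        + "<document><title>Example</title></document>";

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    // Returns the internal subset string, or null if there is no doctype.
    static String internalSubsetOf(Document doc) {
        DocumentType doctype = doc.getDoctype();
        return doctype == null ? null : doctype.getInternalSubset();
    }

    public static void main(String[] args) {
        Document original = parse(XML);
        Document clone = (Document) original.cloneNode(true);

        String before = internalSubsetOf(original);
        String after = internalSubsetOf(clone);

        System.out.println("before clone: " + before);
        System.out.println("after clone:  " + after);
        System.out.println(before != null && before.equals(after)
                ? "internal subset preserved"
                : "internal subset lost or changed");
    }
}
```

Note that the parser may rebuild the internal subset string from the
declarations it saw, so it will not necessarily match the source text
character for character.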

Any ideas about this one?

Jake


---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: internal subset lost after using cloneNode (patch provided!)

Posted by Jacob Kjome <ho...@visi.com>.
At 01:34 PM 8/28/2006, you wrote:
 >Jacob Kjome <ho...@visi.com> wrote on 08/24/2006 01:59:14 AM:
 >
 >>
 >> Sure enough, I looked at CoreDocumentImpl and importNode(Node,
 >> boolean, boolean, Hashtable), where it deals with DOCUMENT_TYPE_NODE
 >> in the switch statement, and it doesn't copy the internal subset
 >> string from the srcdoctype to the newdoctype.  Adding the following
 >> line solves the issue...
 >>
 >> newdoctype.setInternalSubset(srcdoctype.getInternalSubset());
 >>
 >> I've reported the bug and attached a very simple patch.  I hope it
 >> can be applied for the next release!
 >
 >Thanks Jake. I've committed your patch to SVN.
 >
 >> http://issues.apache.org/jira/browse/XERCESJ-1181
 >>

Thanks Michael!

 >>
 >> I'm still curious about the Entity node that gets created as a child
 >> of the DocumentType.  Is that supposed to get created?  Is it necessary?
 >
 >Yes and yes. The entity map contains an Entity node for each of the
 >general entities declared in the DTD. The one Entity in the source
 >document is being copied into the cloned document.
 >

Ahhh...  OK, I don't think Xerces1 did that.

Jake

 >
 >Michael Glavassevich
 >XML Parser Development
 >IBM Toronto Lab
 >E-mail: mrglavas@ca.ibm.com
 >E-mail: mrglavas@apache.org





Re: internal subset lost after using cloneNode (patch provided!)

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Jacob Kjome <ho...@visi.com> wrote on 08/24/2006 01:59:14 AM:

> 
> Sure enough, I looked at CoreDocumentImpl and importNode(Node, 
> boolean, boolean, Hashtable), where it deals with DOCUMENT_TYPE_NODE 
> in the switch statement, and it doesn't copy the internal subset 
> string from the srcdoctype to the newdoctype.  Adding the following 
> line solves the issue...
> 
> newdoctype.setInternalSubset(srcdoctype.getInternalSubset());
> 
> I've reported the bug and attached a very simple patch.  I hope it 
> can be applied for the next release!

Thanks Jake. I've committed your patch to SVN.

> http://issues.apache.org/jira/browse/XERCESJ-1181
> 
> 
> I'm still curious about the Entity node that gets created as a child 
> of the DocumentType.  Is that supposed to get created?  Is it necessary?

Yes and yes. The entity map contains an Entity node for each of the 
general entities declared in the DTD. The one Entity in the source 
document is being copied into the cloned document.
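
To make the entity map concrete, here is a small sketch that lists the
entries of DocumentType.getEntities() for a document and for its deep
clone, using only the standard DOM API. The class name, the helper names,
and the tiny sample document are assumptions for illustration; only the
erh entity is taken from the original post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of Xerces.
final class EntityMapListing {

    // Made-up sample document that declares the erh entity from the post.
    static final String XML =
          "<!DOCTYPE document [\n"
        + "    <!ENTITY erh \"Elliotte Rusty Harold\">\n"
        + "]>\n"
        + "<document>&erh;</document>";

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    // Collects the names of the Entity nodes held by the doctype's entity map.
    static List<String> entityNames(Document doc) {
        List<String> names = new ArrayList<>();
        DocumentType doctype = doc.getDoctype();
        if (doctype == null) {
            return names;
        }
        NamedNodeMap entities = doctype.getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            names.add(entities.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document original = parse(XML);
        Document clone = (Document) original.cloneNode(true);
        System.out.println("entities in original: " + entityNames(original));
        System.out.println("entities in clone:    " + entityNames(clone));
    }
}
```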

> Jake

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org




Re: internal subset lost after using cloneNode (patch provided!)

Posted by Jacob Kjome <ho...@visi.com>.
Sure enough, I looked at CoreDocumentImpl and importNode(Node, 
boolean, boolean, Hashtable), where it deals with DOCUMENT_TYPE_NODE 
in the switch statement, and it doesn't copy the internal subset 
string from the srcdoctype to the newdoctype.  Adding the following 
line solves the issue...

                 newdoctype.setInternalSubset(srcdoctype.getInternalSubset());

I've reported the bug and attached a very simple patch.  I hope it 
can be applied for the next release!

http://issues.apache.org/jira/browse/XERCESJ-1181
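
To illustrate the kind of omission described above, here is a toy model
of a doctype copy routine. This is not the Xerces source: ToyDoctype, the
copy method, and its boolean flag are all hypothetical stand-ins. It only
shows how a field-by-field copy that forgets one field loses the internal
subset while the entities still come across, and how a single assignment
restores it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only; none of these names exist in Xerces.
final class DoctypeCopyModel {

    // Hypothetical stand-in for a document type node.
    static final class ToyDoctype {
        final String name;
        String internalSubset;
        final Map<String, String> entities = new LinkedHashMap<>();

        ToyDoctype(String name) {
            this.name = name;
        }
    }

    // Copies a doctype field by field. With copyInternalSubset == false the
    // routine mimics the reported behavior: entities survive, the subset
    // string does not.
    static ToyDoctype copy(ToyDoctype src, boolean copyInternalSubset) {
        ToyDoctype dst = new ToyDoctype(src.name);
        if (copyInternalSubset) {
            // Analogue of the one-line fix from the patch.
            dst.internalSubset = src.internalSubset;
        }
        dst.entities.putAll(src.entities);
        return dst;
    }

    // Builds a sample doctype loosely based on the one in the post.
    static ToyDoctype sample() {
        ToyDoctype doctype = new ToyDoctype("document");
        doctype.internalSubset =
              "<!ENTITY erh \"Elliotte Rusty Harold\">\n"
            + "<!ELEMENT document (title, signature)>\n";
        doctype.entities.put("erh", "Elliotte Rusty Harold");
        return doctype;
    }

    public static void main(String[] args) {
        ToyDoctype src = sample();
        System.out.println("without the fix: " + copy(src, false).internalSubset);
        System.out.println("with the fix:    " + copy(src, true).internalSubset);
    }
}
```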


I'm still curious about the Entity node that gets created as a child 
of the DocumentType.  Is that supposed to get created?  Is it necessary?

Jake

At 12:31 AM 8/23/2006, you wrote:
 >
 >I think I'm seeing an issue in Xerces2 Java that was previously
 >reported and fixed in XercesC++.
 >
 >internal subset lost after using cloneNode
 >http://issues.apache.org/jira/browse/XERCESC-1170
 >
 >Is this a known problem in Xerces2 Java?  The internal subset exists
 >perfectly intact until I call Document.cloneNode(true).  When I
 >perform a print of the nodes, here's what the document type looks
 >like, first before the clone (expected) and then after (actual)....
 >
 >Expected....
 >
 >         DocumentTypeImpl: name=document
 >          internalSubset=
 >    <!ENTITY erh "Elliotte Rusty Harold">
 >    <!ELEMENT document (title, signature)>
 >    <!ELEMENT title (#PCDATA)>
 >    <!ELEMENT copyright (#PCDATA)>
 >    <!ELEMENT email (#PCDATA)>
 >    <!ELEMENT hr EMPTY>
 >    <!ELEMENT lastmodified (#PCDATA)>
 >    <!ELEMENT signature (hr, copyright, email, lastmodified)>
 >
 >Actual....
 >
 >         DocumentTypeImpl: name=document
 >             EntityImpl: name=erh
 >                 TextImpl: Elliotte Rusty Harold
 >
 >
 >As you can see, Document.cloneNode(true) seems to turn the internal
 >subset <!ENTITY> into an actual Entity Node and the rest of the
 >internal subset (the <!ELEMENT>'s) is discarded.
 >
 >Any ideas about this one?
 >
 >Jake
 >

