
Posted to j-dev@xerces.apache.org by bu...@apache.org on 2003/06/11 20:20:08 UTC

DO NOT REPLY [Bug 15007] - & to &amp

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=15007>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=15007

& to &amp

------- Additional Comments From mrglavas@apache.org  2003-06-11 18:20 -------
I'm not sure what you want the parser to be doing. 'amp' is a predefined entity 
whose replacement text is '&'. Even if you use a reference to amp in a 
(general) entity declaration, the parser will only expand it when the enclosing 
entity is expanded.

For example, consider these two entity declarations:
<!ENTITY one "&amp;">
<!ENTITY two "&amp;amp;">

When referenced in content, 'one' expands to '&', and 'two' expands to '&amp;'. 
The parser doesn't expand this any further. It is the replacement text.
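
To make the expansion concrete, here is a minimal sketch using the standard 
JAXP DOM API. The class name EntityExpansionDemo, the helper expand() and the 
root element 'r' are made up for illustration, and it assumes a parser with 
default settings (DOCTYPE allowed, entity references expanded).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Xerces.
class EntityExpansionDemo {

    // Parses a tiny document that references the named entity in content
    // and returns the text the DOM ends up holding.
    static String expand(String entityName) {
        String xml =
            "<!DOCTYPE r [\n"
          + "  <!ENTITY one \"&amp;\">\n"
          + "  <!ENTITY two \"&amp;amp;\">\n"
          + "]>\n"
          + "<r>&" + entityName + ";</r>";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrapped so callers need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("one -> " + expand("one")); // one -> &
        System.out.println("two -> " + expand("two")); // two -> &amp;
    }
}
```

The text held by the DOM for 'two' is the five characters '&amp;', not a 
reference waiting to be expanded again.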

When the serializer writes out a text node, it has to replace each '&' with 
an '&amp;' reference. It can't assume that an occurrence of &amp; in a text node 
should be left as is. Once the DOM has been constructed from a document, all 
entity and character reference replacement has already taken place. So even 
if you hand-craft your own document in memory using DOM's factory methods 
(createTextNode, etc.), the strings you pass in are interpreted as if all 
entity and character references had already been replaced.
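
As a rough illustration of that rule, the sketch below builds a document in 
memory and writes it out. It uses the JAXP identity Transformer as a stand-in 
serializer rather than Xerces' own serializer classes (that substitution is an 
assumption for the sake of a self-contained example), and the class name 
TextNodeSerializationDemo and helper serialize() are made up.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of Xerces.
class TextNodeSerializationDemo {

    // Builds <r>text</r> in memory and serializes it to a string.
    static String serialize(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("r");
            // The string is taken literally: no reference in it is expanded.
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);

            // Identity transform, used here as a generic serializer.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("&"));     // text '&' is written as &amp;
        System.out.println(serialize("&amp;")); // text '&amp;' is written as &amp;amp;
    }
}
```

Escaping every '&' on output is what lets a parser reading the result recover 
exactly the text that was in the node.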

Adding a feature to the serializer which leaves &amp; as &amp; doesn't seem 
reasonable to me. That behaviour would appear to be non-compliant. If I've 
misinterpreted your request, could you please clarify?
