
Posted to j-users@xerces.apache.org by "Hellmann Peter (ext)" <Pe...@mch.siemens.de> on 2002/12/13 11:27:30 UTC

RFC822 mail in CDATA section

Hi all,

I have to transport RFC 822 conformant emails in XML, so I want to embed
each mail in a CDATA section. However, mail headers contain greater-than and
less-than characters, and a mail could even contain the CDATA end delimiter
]]>. How can I embed a mail without base64-encoding it? Mails usually
contain attachments that are already base64-encoded, so I do not want to
encode the mail again. Is there any tool or function within Xerces that lets
me check whether a piece of data can be embedded in a CDATA section? Is there
a function that converts < and > to &lt; and &gt;?

I use the following code but no escaping occurs:

Node cdata = document.createCDATASection("");
cdata.setNodeValue(the_mail);
parent.appendChild(cdata); 

Any help and/or hints and/or links are highly appreciated. Thank you.

Mit freundlichen Grüßen
kind regards
Peter Hellmann

Siemens Aktiengesellschaft
Information and Communication Networks
Voice & Data Recording
ICN WN CS VDR RD
Hofmannstraße 51              +49 89 722 49347
office 1746.441               +49 89 722 22985
D-81379 München               peter.hellmann@mch.siemens.de
GERMANY


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org

Re: RFC822 mail in CDATA section

Posted by Joseph Kesselman <ke...@us.ibm.com>.

The answer is that you can't count on using a single CDATA section. You 
have to be prepared either to exit the CDATA section for a moment, so that 
part or all of the ]]> text is in normal text, or to split it across 
successive CDATA sections.
        <![CDATA[This works: ]]>]]&gt;<![CDATA[ -- note that > needs escaping only when it directly follows two right brackets.]]>
        <![CDATA[So does this: ]]]]><![CDATA[> -- or this ]]]><![CDATA[]>]]>
Obviously your XML generator has to recognize this case and deal with it, 
and when you parse the document again you have to accept that your text 
may be split across multiple text nodes/events.
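
As a rough illustration of the "successive CDATA sections" approach, here is a minimal Java sketch. The class and method names (CdataSplitter, splitForCdata, toCdataMarkup) are made up for this example and are not part of Xerces. It cuts the text between "]]" and ">" wherever "]]>" occurs, so that no single chunk contains the delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Xerces API: splits text into chunks that are
// each safe to place inside their own CDATA section.
final class CdataSplitter {

    // Returns chunks such that no chunk contains "]]>" and the chunks
    // concatenated together give back the original text.
    static List<String> splitForCdata(String text) {
        List<String> chunks = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = text.indexOf("]]>", start)) >= 0) {
            // Cut between "]]" and ">": the chunk ends with "]]",
            // and the next chunk starts with ">".
            chunks.add(text.substring(start, hit + 2));
            start = hit + 2;
        }
        chunks.add(text.substring(start));
        return chunks;
    }

    // Wraps every chunk in its own CDATA section and joins the result.
    static String toCdataMarkup(String text) {
        StringBuilder sb = new StringBuilder();
        for (String chunk : splitForCdata(text)) {
            sb.append("<![CDATA[").append(chunk).append("]]>");
        }
        return sb.toString();
    }
}
```

With a DOM, the same chunks could each be passed to document.createCDATASection(chunk) and appended to the parent in order. On the reading side, the pieces have to be concatenated again, since they arrive as separate nodes or events.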

Note that a similar hassle arises if any of the characters in the data 
can't be expressed directly in the encoding you've chosen for your XML 
file. CDATA sections *only* support that encoding; if you need to do a 
numeric character reference, you need to exit the CDATA section... or 
re-encode all the data in something like base-64, in which case the CDATA 
section really isn't helping you.
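
To make the encoding point concrete, the following is a small sketch, assuming the text is written as ordinary character data rather than inside a CDATA section. NcrEscaper and escapeUnencodable are hypothetical names. It replaces every character the chosen output encoding cannot represent with a numeric character reference. It deliberately does not touch & or <, which would still need escaping separately.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper: replaces characters that the target encoding cannot
// represent with numeric character references such as &#x20AC;.
// It does NOT escape '&' or '<'; that is a separate step.
final class NcrEscaper {

    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Work on whole code points so surrogate pairs stay together.
            int codePoint = text.codePointAt(i);
            int len = Character.charCount(codePoint);
            String piece = text.substring(i, i + len);
            if (encoder.canEncode(piece)) {
                sb.append(piece);
            } else {
                sb.append("&#x")
                  .append(Integer.toHexString(codePoint).toUpperCase())
                  .append(';');
            }
            i += len;
        }
        return sb.toString();
    }
}
```

For example, with ISO-8859-1 as the file encoding the umlauts in a German signature pass through unchanged, while a euro sign would come out as &#x20AC;. Inside a CDATA section that reference would be read as literal text, which is why the section has to be exited first.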


The better solution, in most cases, is to just write out text, escaping 
those characters which need to be escaped. CDATA sections are really 
intended as a convenience for humans hand-editing XML files without the 
benefit of XML-aware tools... and as XML knowledge/support becomes 
ubiquitous and people push the boundaries of this trivial workaround, they 
start to become more annoying than useful. Unless you are working with an 
archaic tool that doesn't understand numeric character escapes, I would 
recommend *not* using CDATA sections. As silver bullets go, they're very 
tarnished and rather alloyed.
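
For the "just write out text" route, a minimal hand-rolled escaper might look like the sketch below. XmlTextEscaper and escapeText are illustrative names only, not a Xerces function. It escapes &, < and > in element content; escaping every > also takes care of any ]]> in the mail. It does not deal with characters that XML 1.0 forbids outright, such as most control characters.

```java
// Hypothetical helper: escapes text for use as ordinary element content.
final class XmlTextEscaper {

    static String escapeText(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    // Escaping every '>' also neutralizes "]]>".
                    sb.append("&gt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

When building a DOM, this step is normally not needed by hand: adding the mail with document.createTextNode(the_mail) instead of a CDATA section leaves the escaping to whatever serializer writes the document out.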

______________________________________
Joe Kesselman  / IBM Research

