Posted to j-dev@xerces.apache.org by Michael Goldberg <MG...@yet2.com> on 2001/01/18 00:59:43 UTC

Serializing Trademark symbol (TM) results in unexpected output

All,

I have the following XHTML span element in an input file: <span
style="font-family:Symbol;">&#212;</span>.

I read the input file and process it using various DOM methods.  At one
point, I use the XMLSerializer class to write the above node to a file.
Unfortunately, the output I get is the following: <span
style="font-family:Symbol;">Ã"</span>.  Note that the content of the span
element is now two unexpected characters instead of "&#212;", which is what
I expected.

It looks like the numeric character reference was converted to a
different character encoding at some point.  I suspect XMLSerializer is
doing this, but I suppose it could be happening during parsing as well.
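
For what it is worth, here is a minimal sketch of how output like this could
arise.  It assumes (I have not confirmed this) that the parser resolves
&#212; to the single character U+00D4, that the file is then written as
UTF-8, and that I am viewing it in a single-byte encoding.  The class and
method names are made up for illustration, and the sketch uses only the JDK,
not Xerces.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text as UTF-8, then decode the same bytes as ISO-8859-1,
    // mimicking a viewer that assumes the wrong encoding (an assumption).
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The character that &#212; refers to.
        String parsed = "\u00D4";

        // Print the UTF-8 bytes of that one character: C3 94
        for (byte b : parsed.getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();

        // Print the code points seen when those bytes are misread: U+00C3 U+0094
        for (char c : utf8ReadAsLatin1(parsed).toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

U+00C3 is the "Ã" I see.  In Windows-1252 the byte 0x94 is a right double
quotation mark, which would account for the second character, so the data
may be intact and only written in a different encoding than I expected.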

Is there a way to stop the translation and to have the output be "&#212;"?

Thanks,
Michael