Posted to users@cocoon.apache.org by "J.Pietschmann" <j3...@yahoo.de> on 2003/04/01 19:05:46 UTC

Re: Problem with indent

g[R]eK wrote:
> The problem is caused by entities like "&oacute;", because each one takes 8 bytes,
> while the character ó takes only 1 or 2 bytes. That is a big difference when the
> character ó is repeated many times.
> I hope you understand what I mean.
An "encoding problem" usually refers to mismatches regarding the mapping of
Unicode characters to bytes in the output.
Your problem, that the serializer maps characters to predefined HTML entities,
is somewhat trickier, and there is no standardized way to deal with it.
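
To put numbers on the size difference, here is a small sketch that counts the bytes each representation of ó needs. The class and method names are made up for illustration; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Cocoon or Xalan.
public class EntitySizes {

    // Number of bytes the string occupies when written as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies when written as ISO-8859-1.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String raw = "\u00F3"; // the character ó (U+00F3)

        System.out.println("named entity &oacute;  : " + utf8Length("&oacute;") + " bytes");
        System.out.println("char reference &#243;  : " + utf8Length("&#243;") + " bytes");
        System.out.println("raw char, UTF-8        : " + utf8Length(raw) + " bytes");
        System.out.println("raw char, ISO-8859-1   : " + latin1Length(raw) + " bytes");
    }
}
```

The named entity costs 8 bytes, a decimal character reference 6, the raw character 2 bytes in UTF-8 and 1 byte in ISO-8859-1.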

Cocoon uses an identity XML transformation for serialization, usually performed
by Xalan (default setting). You can have a look into the Xalan docs and search
for extensions to the xsl:output element which might solve your problem, or ask
on the Xalan list. There is also a properties file for the HTML entities; you
can provide a modified version, which may cause Xalan to output UTF-8 encoded
bytes or at least character references (which are a bit shorter).
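
If you want to see what your XSLT processor does outside of Cocoon, a standalone identity transformation through the standard JAXP API is a quick way to experiment. This is only a rough sketch: the class name and the sample input are invented, it is not how Cocoon configures its serializers, and whether the html method writes ó as "&oacute;", as a character reference, or as raw bytes depends on the processor and its entity table, so check the output yourself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical test harness, not part of Cocoon or Xalan.
public class IdentitySerialize {

    // Runs an identity transformation and returns the serialized result.
    // "method" and "encoding" correspond to the xsl:output attributes
    // of the same names.
    static String serialize(String xml, String method, String encoding) {
        try {
            // newTransformer() without a stylesheet yields an identity transform.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, method);
            t.setOutputProperty(OutputKeys.ENCODING, encoding);
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<p>\u00F3</p>"; // sample document containing ó

        // Compare how the same character is written under different settings.
        System.out.println("html / UTF-8    : " + serialize(input, "html", "UTF-8"));
        System.out.println("html / US-ASCII : " + serialize(input, "html", "US-ASCII"));
        System.out.println("xml  / UTF-8    : " + serialize(input, "xml", "UTF-8"));
    }
}
```

The xml method never uses the named HTML entities, so comparing its output with that of the html method shows whether the entity table is what inflates your pages.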

J.Pietschmann

