Posted to users@cocoon.apache.org by "Bazeley, John" <jb...@bloomberg.com> on 2005/07/08 10:55:14 UTC

Stream Generator / uploading UTF-8 encoded Chinese files

Hi all,

I'm trying to use the stream generator to upload XML files that
are UTF-8 encoded and contain Chinese characters. The source system
is Windows XP, and Cocoon is v2.1.7 running on Solaris 9 / Java
1.4.2. Whether I upload the file with curl through my own pipeline
or use the /samples/stream/process-order pipeline, the result is
the same: the file is returned to me with all the Chinese
characters mangled ('od' shows that every Chinese character has
been converted to 357 277 275).
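
For reference, the octal bytes 357 277 275 are hex EF BF BD, which is the
UTF-8 encoding of U+FFFD, the Unicode replacement character. A decoder
usually substitutes it for input it cannot interpret, so the damage probably
occurs at a decoding step before the output is re-encoded. The following is
only a minimal Python sketch of that effect. It assumes, purely for
illustration, that the UTF-8 bytes are decoded as US-ASCII somewhere along
the way; the sample text and the choice of ASCII are hypothetical and are not
taken from the Cocoon code.

```python
# Octal 357 277 275, as printed by od, is hex EF BF BD.
od_bytes = bytes([0o357, 0o277, 0o275])

# Those three bytes are the UTF-8 encoding of U+FFFD (REPLACEMENT CHARACTER).
replacement = od_bytes.decode("utf-8")

# Hypothetical sample: two Chinese characters, three UTF-8 bytes each.
original = "\u4e2d\u6587"
utf8_bytes = original.encode("utf-8")

# Assumed failure mode: the bytes are decoded with the wrong charset
# (ASCII here) and undecodable bytes are replaced rather than rejected.
mangled = utf8_bytes.decode("ascii", errors="replace")

# Re-encoding the mangled text as UTF-8 yields only EF BF BD sequences.
reencoded = mangled.encode("utf-8")

print(hex(ord(replacement)))
print(" ".join(format(b, "03o") for b in reencoded))
```

If this is what is happening, both ends can truthfully report UTF-8 while a
step in between has already discarded the original characters.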

I have inserted debug output into the stream generator and the XML
serialiser, and both report that they are using UTF-8 encoding.

Why is my document getting corrupted? What am I doing wrong?

The source document has 'encoding="UTF-8"' in its <?xml ... declaration,
and IE and Firefox both display it correctly and report the encoding
as UTF-8, so I am inclined to believe the document is correctly
encoded.

All suggestions are welcome.

Thanks, John


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: Stream Generator / uploading UTF-8 encoded Chinese files

Posted by Lionel Crine <cr...@4dconcept.fr>.
Hi,

There are a few places where the encoding can be configured:
Did you configure the form-encoding setting in web.xml?
Did you try using the setCharacterEncoding action (at the start of
your pipeline)?

Did you open your document in UltraEdit to check its actual encoding?
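
If no such editor is at hand, a strict decode gives a similar check. This is
a minimal Python sketch, not part of Cocoon; the function name check_utf8
and the sample document are hypothetical. It reports whether the bytes are
valid UTF-8 and whether they begin with a UTF-8 byte order mark.

```python
import codecs


def check_utf8(data: bytes) -> dict:
    """Report whether data is valid UTF-8 and whether it starts with a BOM."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # bytes.decode is strict by default: invalid input raises an error.
        data.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {"valid_utf8": valid, "has_bom": has_bom}


# Hypothetical sample standing in for the uploaded file. For a real file,
# read it in binary mode, e.g. open(path, "rb").read().
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<order>\u4e2d\u6587</order>"
).encode("utf-8")

print(check_utf8(sample))
```

Running the same check on the file before upload and on the response body
should help narrow down where the bytes change.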


Lionel




-- 
Lionel CRINE
Ingénieur Systèmes documentaires
Société : 4DConcept
22 rue Etienne de Jouy 78353 JOUY EN JOSAS
Tel : 01.34.58.70.70 Fax : 01.39.46.06.90