Posted to fop-users@xmlgraphics.apache.org by Niels Verdonk <Ni...@ctp.com> on 2002/07/12 13:11:36 UTC

Strange encoding problem

Hi,

I've got a problem with Latin-5 encoding. (Latin-5, i.e. ISO-8859-9, is a
variant of Latin-1 which replaces 6 Icelandic characters with Turkish ones.)

I export UTF-8 from the database to an XML document (using an
OutputStreamWriter with encoding UTF8).
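
For reference, the export step described above might look roughly like the following. This is only a sketch: the class name `Utf8Export`, the method `export` and the `<name>` element are made up for illustration, and the real code writes to a file rather than to a byte array. The point is that the XML declaration names the same encoding the writer uses, and that a Turkish letter such as U+011F takes two bytes in UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8Export {
    // Hypothetical helper: writes a tiny XML document as UTF-8 bytes.
    static byte[] export(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // The writer's charset and the declared encoding must agree.
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            w.write("<name>" + text + "</name>\n");
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = export("\u011F"); // LATIN SMALL LETTER G WITH BREVE
        String decoded = new String(data, StandardCharsets.UTF_8);
        // One non-ASCII character, so the byte count exceeds the char count.
        System.out.println(data.length + " bytes for " + decoded.length() + " chars");
    }
}
```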

When I view the XML document with ISO-8859-9 encoding I see the Turkish
characters properly. This document is transformed to FO using XSL, with
UTF-8 output encoding set by:
<xsl:output method="xml" encoding="utf-8"/>

Again, when I view this document with ISO-8859-9 encoding I see the
Turkish characters properly.

But in the generated PDF, the characters are displayed as the Latin-1
Icelandic characters.

Can anyone tell me where I'm going wrong?

Thanks,

Niels

Re: Strange encoding problem

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Niels Verdonk wrote:
> I've got a problem with Latin-5 encoding. (Latin-5, i.e. ISO-8859-9, is a
> variant of Latin-1 which replaces 6 Icelandic characters with Turkish ones.)
> 
> I export UTF-8 from the database to an XML document (using an
> OutputStreamWriter with encoding UTF8).
> 
> When I view the XML document with ISO-8859-9 encoding I see the Turkish
> characters properly.

What does "I view the XML document with ISO-8859-9 encoding" mean?
Do you use an editor or browser where you can choose the encoding?
If the file was actually saved with UTF-8 encoding, non-ASCII
characters won't display properly when it is read as ISO-8859-9.

> This document is transformed to FO using XSL, with
> UTF-8 output encoding set by:
> <xsl:output method="xml" encoding="utf-8"/>
> 
> Again, when I view this document with ISO-8859-9 encoding I see the
> Turkish characters properly.

This should not happen: if the document is UTF-8 encoded, and
you use a viewer which thinks it is ISO-8859-9 encoded, you won't
see the expected characters, or you will at least see some
unexpected characters around them.
Note that XML editors are smart and use the encoding they find in
the XML file, not the encoding you tell them to use.
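
To illustrate the mismatch, here is a minimal sketch (not taken from anyone's actual setup; `EncodingMismatch` and `decodeAs` are made-up names) that encodes three Turkish letters as UTF-8 and decodes the bytes as ISO-8859-9. It assumes the Java runtime ships the ISO-8859-9 charset, which is usual but not guaranteed by the platform specification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Hypothetical helper: encode with one charset, decode with another.
    static String decodeAs(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-9 is available in this JVM.
        Charset latin5 = Charset.forName("ISO-8859-9");
        String turkish = "\u011F\u015F\u0130"; // g-breve, s-cedilla, dotted capital I

        // Each of these letters is two bytes in UTF-8, and a single-byte
        // charset turns every byte into its own character.
        String garbled = decodeAs(turkish, StandardCharsets.UTF_8, latin5);
        System.out.println(turkish.length() + " chars become " + garbled.length());

        // With matching charsets the text survives unchanged.
        String intact = decodeAs(turkish, latin5, latin5);
        System.out.println("round trip intact: " + intact.equals(turkish));
    }
}
```

So a file that looks right when viewed as ISO-8859-9 is most likely stored as ISO-8859-9 (or the viewer is ignoring the chosen setting), whatever the writer was supposed to produce.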

> But in the generated PDF, the characters are displayed as the Latin-1
> Icelandic characters.
> 
Either you declare the encoding of the XML incorrectly, or the parser
doesn't understand the encoding and falls back to ISO-8859-1 for
some odd reason (shouldn't happen). Run FOP with the -d flag and
check for suspicious messages.
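
A minimal sketch of the first possibility, using the JDK's built-in JAXP parser rather than FOP itself (`DeclaredEncoding` and `parseText` are made-up names, and support for ISO-8859-9 in the parser is an assumption): the same byte 0xF0 is read as the Turkish g-breve (U+011F) or as the Icelandic eth (U+00F0), depending only on the encoding named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclaredEncoding {
    // Hypothetical helper: builds <t>X</t> where X is one raw byte, preceded
    // by an XML declaration naming the given encoding, and returns the text
    // the parser decoded from it.
    static String parseText(String declaredEncoding, byte contentByte) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String prolog = "<?xml version=\"1.0\" encoding=\"" + declaredEncoding + "\"?><t>";
        out.write(prolog.getBytes(StandardCharsets.US_ASCII));
        out.write(contentByte);
        out.write("</t>".getBytes(StandardCharsets.US_ASCII));

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(out.toByteArray()));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        byte b = (byte) 0xF0;
        // Print code points rather than glyphs to avoid console encoding issues.
        System.out.printf("declared ISO-8859-9: U+%04X%n", (int) parseText("ISO-8859-9", b).charAt(0));
        System.out.printf("declared ISO-8859-1: U+%04X%n", (int) parseText("ISO-8859-1", b).charAt(0));
    }
}
```

If the file really holds ISO-8859-9 bytes but is declared (or defaulted) as ISO-8859-1, this is exactly the Turkish-to-Icelandic substitution described in the original post.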

J.Pietschmann


Re: Strange encoding problem

Posted by Mathy V Arumugam <ma...@jpl.nasa.gov>.
I've been struggling with the same problem as well. I am using
'iso-8859-1' encoding in both my XSL and XML files. Using a text editor I
am able to view the symbols in the XML file, but the PDF file generated by
FOP replaces all the symbols with a '#' sign.

Any advice?
Mathy
