Posted to users@jena.apache.org by Alessandro Carrara <al...@gmail.com> on 2010/12/20 12:31:43 UTC

Issue with unusual characters

Hi,
I use Jena for a project in Java.
I have a problem with the handling of some characters.

When I try to create an attribute in XML with UTF-8 encoding, I pass the value as a
String containing special characters.

I get the following result:

In Java:
resource.addProperty(property, "SÜDBUR");

In XML:
<js:property> SÃ? DBUR </js:property>

Can you help me?

thanks

Re: Issue with unusual characters

Posted by Andy Seaborne <an...@epimorphics.com>.

On 20/12/10 11:31, Alessandro Carrara wrote:
> Hi,
> use Jena for a project in Java.
> I have a problem in the management of some characters.
>
> When I try to create an attribute in XML encoding UTF-8, I pass a value as a
> String with special characters.
>
> I find the following result:
>
> In java:
> resource.addProperty (property, "SÜDBUR");


>
> In xml:
> <js:property>  SÃ? DBUR</js:property>


It looks like you are viewing it as ISO-8859-1.

It all depends on what program you are using to view the XML.

Ü is Unicode U+00DC (\u00DC in Java source).
In UTF-8 that encodes as the two bytes C3 9C.

In ISO-8859-1, C3 is Ã and 9C is a non-printing control character (hence "? ").
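
The byte arithmetic above can be checked with a few lines of plain Java. This is only a sketch to illustrate the mismatch; the class and method names (MojibakeDemo, toHex, misread) are made up for the example and have nothing to do with Jena.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Formats bytes as upper-case hex pairs separated by single spaces. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes text as UTF-8, then (wrongly) decodes the same bytes as ISO-8859-1. */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String u = "\u00DC"; // U+00DC, written as an escape so the source file encoding does not matter
        System.out.println(toHex(u.getBytes(StandardCharsets.UTF_8))); // prints "C3 9C"

        // One character became two: U+00C3 followed by the control character U+009C.
        String wrong = misread("S\u00DCDBUR");
        System.out.println(wrong.length()); // prints 7, the original string has 6 characters
    }
}
```

The data on disk is the same in both cases; only the decoding step differs, which is why the viewer matters.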

Does the XML start with an encoding declaration? If it does not (XML without
a declaration defaults to UTF-8), or if it declares utf-8, then your XML is
probably OK, and it's just the way you are looking at the file.

<?xml version='1.0' encoding='utf-8'?>

If it starts with

<?xml version='1.0' encoding='iso-8859-1'?>
or some such, then the file data is likely corrupt: the bytes are UTF-8 but
the declaration tells a parser to read them as something else.
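
A quick way to see which case applies is to look at the first bytes of the file. The following is a rough sketch, not a real XML parser: it assumes an ASCII-compatible file without a byte order mark, and the names XmlDeclSniffer and declaredEncoding are invented for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlDeclSniffer {
    // Matches an XML declaration at the very start and captures the encoding name.
    // [^>]*? keeps the search inside the declaration, so attributes of later elements are ignored.
    private static final Pattern DECL = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*['\"]([A-Za-z][A-Za-z0-9._-]*)['\"]");

    /** Returns the encoding named in the XML declaration, or empty if none is declared. */
    static Optional<String> declaredEncoding(byte[] head) {
        // The declaration itself is plain ASCII, so decoding the first bytes as ASCII is enough here.
        int n = Math.min(head.length, 200);
        String text = new String(head, 0, n, StandardCharsets.US_ASCII);
        Matcher m = DECL.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        byte[] withDecl = "<?xml version='1.0' encoding='utf-8'?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] noDecl = "<a/>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(declaredEncoding(withDecl)); // prints "Optional[utf-8]"
        System.out.println(declaredEncoding(noDecl));   // prints "Optional.empty"
    }
}
```

An empty result or utf-8 points at the viewer; anything else next to C3 9C bytes points at the writing code.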

	Andy

Do not use Java's FileWriter - it writes in the platform default
encoding, and that is often not UTF-8.
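
One way to avoid the platform default is to name the charset explicitly when opening the writer. A minimal sketch using only the JDK follows; Utf8WriteDemo, writeUtf8 and writeAndReadBack are hypothetical names for this example. With Jena the analogous step is presumably to hand the model writer an OutputStream, or a Writer built like the one below, but that is an assumption and is not shown here.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8WriteDemo {
    /** Writes text to the given path as UTF-8, whatever the platform default is. */
    static void writeUtf8(Path path, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(Files.newOutputStream(path), StandardCharsets.UTF_8)) {
            w.write(text);
        }
    }

    /** Writes text to a temporary file as UTF-8 and returns the raw bytes that ended up on disk. */
    static byte[] writeAndReadBack(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".xml");
            try {
                writeUtf8(tmp, text);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeAndReadBack("\u00DC"); // U+00DC
        // Two bytes on disk, 0xC3 then 0x9C, regardless of the platform default encoding.
        System.out.printf("%d bytes: %02X %02X%n", bytes.length, bytes[0] & 0xFF, bytes[1] & 0xFF);
    }
}
```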

>
> Can you help me?
>
> thanks
>