Posted to user@struts.apache.org by Gus Heck <gu...@olin.edu> on 2003/12/11 23:20:19 UTC
Problems with character conversion
So I have been prototyping stuff for a project in Struts and I decided
on a whim to see how well it handled unusual international text. So I
swapped over to a Turkish keyboard, which produces several quite unusual
characters, and typed in a couple of place names I know give most programs
fits because they contain a capital dotted I which looks like this: I.
(set your encoding to Turkish if necessary to view it, it should look
like a capital i with a dot above it)
What I got back when the results of the form entry were displayed was
&amp;#304;
I am pleased to see that nothing in Struts, or elsewhere in my program,
actually died, but it would be just slightly cooler if it actually came
back with &#304; (which renders correctly, at least in Mozilla). The
problem is that &#304; appears to be what is stored in the database
(MySQL 4.0), and it gets re-escaped on the way back out. (The MySQL shell
does funny things with the double-dotted u (ü) and the c with cedilla (ç),
but those are at least recorded straight up, and not escaped in either
direction.) I did some searching and found this:
http://www.faqchest.com/prgm/tomcat-l/tmct-01/tmct-0111/tmct-011153/tmct01112715_24444.html
but that didn't seem to solve the problem. I'm sure this has been solved
already, I just can't find it. Support for Turkish, or for that matter
any other language, is not critical to my project, but if there is a
not-too-painful solution it would be nice to drop it in.
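
For anyone trying to reproduce this, here is a minimal stand-alone sketch of the double escaping described above. It uses only the JDK; the class name, the method names and the sample place name are made up for illustration, and this is not how Struts itself is implemented. It shows a stored numeric character reference being escaped a second time on output, and one way to turn such a reference back into the real character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only, not part of Struts.
class NcrDemo {
    // Matches decimal numeric character references such as "&#304;".
    private static final Pattern DECIMAL_NCR = Pattern.compile("&#(\\d{1,7});");

    // Minimal HTML escaping, the kind an output layer typically applies.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Replace each decimal reference with the character it names.
    static String decodeDecimalNcrs(String s) {
        Matcher m = DECIMAL_NCR.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group(); // leave out-of-range references untouched
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed example of what ends up stored after the form post.
        String stored = "&#304;stanbul";
        // Escaping again on output yields the text seen on the page.
        System.out.println(escapeHtml(stored)); // prints: &amp;#304;stanbul
        // Decoding before storage would keep the real character instead.
        System.out.println(decodeDecimalNcrs(stored).equals("\u0130stanbul")); // prints: true
    }
}
```

Decoding references after the fact is a workaround at best; getting the form encoding right avoids the reference being submitted in the first place.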
-Gus
---------------------------------------------------------------------
To unsubscribe, e-mail: struts-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: struts-user-help@jakarta.apache.org
Re: Problems with character conversion [partial solution]
Posted by Jason Lea <ja...@kumachan.net.nz>.
This is probably what you are looking for:
http://www.anassina.com/struts/i18n/i18n.html
The main point here is that it tells the browser the content is UTF-8
instead of the default Latin-1 encoding. It also includes the filter
you used earlier to tell the Java servlet layer that it should expect
UTF-8 encoded content from forms.
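
To see why the declared encoding matters, here is a small stand-alone sketch (plain JDK, no servlet API; the class and method names are made up). It checks that Latin-1 cannot encode the dotted capital I at all, and shows what happens when the UTF-8 bytes a browser would send for it are decoded with the wrong charset, which is roughly the mismatch a request-encoding filter is meant to prevent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;

// Hypothetical demo class, for illustration only.
class EncodingDemo {
    // True if the named charset has a byte representation for c.
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    // Decode a percent-encoded form value using the given charset.
    static String decodeParam(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        char dottedI = '\u0130'; // capital I with dot above

        System.out.println(canEncode("ISO-8859-1", dottedI)); // prints: false
        System.out.println(canEncode("UTF-8", dottedI));      // prints: true

        // U+0130 is the two bytes C4 B0 in UTF-8.
        String raw = "%C4%B0";
        // Right charset: one character comes back.
        System.out.println(decodeParam(raw, "UTF-8").equals("\u0130")); // prints: true
        // Wrong charset: each byte becomes its own Latin-1 character.
        System.out.println(decodeParam(raw, "ISO-8859-1").equals("\u00C4\u00B0")); // prints: true
    }
}
```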
Gus Heck wrote:
>Well, after a little poking around, I discovered that it was the
>browser's fault. Setting the character encoding that the browser is
>using allows it to send the character without &-escaping it, but then
>one has to also set the encoding when viewing it later. I am wondering
>if there is a way to support the full range of characters in Unicode
>without having to set the encoding... The app I am building will not be
>internationalized, but we may want to store some international names and
>places in the DB and it would be nice if they could display correctly
>rather than resorting to anglicized spellings, which might be annoying
>to some people whose names or addresses will be misspelled.
>
>Anyone know how to do this?
>
>Gus Heck wrote:
>
>>>programs fits because they contain a capital dotted I which looks
>>>like this: I. (set your encoding to Turkish if necessary to view it,
>>>it should look like a capital i with a dot above it)
>>>
>>>
>>heh it seems to give my mailer fits too, as it got converted to I with
>>a period after it.... :)
--
Jason Lea
Re: Problems with character conversion [partial solution]
Posted by Gus Heck <gu...@olin.edu>.
Well, after a little poking around, I discovered that it was the
browser's fault. Setting the character encoding that the browser is
using allows it to send the character without &-escaping it, but then
one has to also set the encoding when viewing it later. I am wondering
if there is a way to support the full range of characters in Unicode
without having to set the encoding... The app I am building will not be
internationalized, but we may want to store some international names and
places in the DB and it would be nice if they could display correctly
rather than resorting to anglicized spellings, which might be annoying
to some people whose names or addresses will be misspelled.
Anyone know how to do this?
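
One approach sometimes used when the page encoding can't be relied on is to emit only ASCII on output and write every other character as a numeric character reference, since those refer to Unicode code points whatever charset the page declares. A rough sketch follows, using only the JDK; the class and method names are hypothetical, the sample names are invented, and it assumes the data is held as real characters rather than as already-escaped text. It is not something proposed elsewhere in this thread.

```java
// Hypothetical helper, for illustration only.
class AsciiHtml {
    // Escape HTML specials and write every non-ASCII code point
    // as a decimal numeric character reference.
    static String toAsciiHtml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as one unit
            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '"') {
                out.append("&quot;");
            } else if (cp < 0x80) {
                out.append((char) cp);    // plain ASCII passes through
            } else {
                out.append("&#").append(cp).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample input containing U+0130 and U+00FC.
        String name = "\u0130zmir & \u00FC";
        System.out.println(toAsciiHtml(name)); // prints: &#304;zmir &amp; &#252;
    }
}
```

The trade-off is that the output must be escaped exactly once, at the point of display, or the double escaping from the first message comes straight back.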
Gus Heck wrote:
>
>> programs fits because they contain a capital dotted I which looks
>> like this: I. (set your encoding to Turkish if necessary to view it,
>> it should look like a capital i with a dot above it)
>
>
> heh it seems to give my mailer fits too, as it got converted to I with
> a period after it.... :)
Re: Problems with character conversion
Posted by Gus Heck <gu...@olin.edu>.
> programs fits because they contain a capital dotted I which looks like
> this: I. (set your encoding to Turkish if necessary to view it, it
> should look like a capital i with a dot above it)
heh it seems to give my mailer fits too, as it got converted to I with a
period after it.... :)