Posted to users@tapestry.apache.org by Foror <fo...@mail.ru> on 2007/07/07 11:37:31 UTC
Russian symbols problems
I'm using message properties files in UTF-8 with Russian characters. This
works in T4, but in T5 it does not: the Russian characters come out corrupted.
There is also a problem with @ApplicationState: after saving the state and
going back to the form, the Russian text is corrupted as well.
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tapestry.apache.org
For additional commands, e-mail: users-help@tapestry.apache.org
Re: Russian symbols problems
Posted by Ulrich Stärk <ul...@spielviel.de>.
Have you tried this:
http://wiki.apache.org/tapestry/Tapestry5Utf8Encoding
This will ensure that pages served by Tapestry are UTF-8 encoded. For me,
this was enough to make T5 serve German UTF-8 characters correctly.
Uli
Foror wrote:
> I'm using message properties files in UTF-8 with Russian characters. This
> works in T4, but in T5 it does not: the Russian characters come out corrupted.
>
> There is also a problem with @ApplicationState: after saving the state and
> going back to the form, the Russian text is corrupted as well.
Re: Russian symbols problems
Posted by Steven Coco <co...@stevencoco.com>.
Hi.
I may have missed the original question a little. I should probably have a
look at the source at this point.
First, is this about Tapestry 5 or an earlier version? I was assuming this was
about T5; if not, I don't think anything I've said applies!
I do understand Java's character handling. For some reason that I don't see
in the message now, I thought you were specifically talking about .properties
files. What I brought up is that, first, if you are using either
ResourceBundle or Properties directly, then Java does in fact pin those
files at ISO 8859-1, regardless of the platform encoding: see the
documentation for java.util.Properties. And note that this applies to reading
local files only.
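A minimal sketch of that behavior, assuming Java 7 or later (the Reader overload of Properties.load was added in Java 6). The class name, the "greeting" key and the sample text are made up for illustration; the file content is simulated in memory rather than read from disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEncodingDemo {

    // A Russian greeting, written with escapes so this source stays pure ASCII.
    static final String GREETING = "\u041f\u0440\u0438\u0432\u0435\u0442";

    // Simulates the bytes of a .properties file that was saved as UTF-8.
    static byte[] utf8File() {
        return ("greeting=" + GREETING + "\n").getBytes(StandardCharsets.UTF_8);
    }

    // Properties.load(InputStream) decodes the bytes as ISO 8859-1,
    // whatever the platform encoding is, so UTF-8 content comes out garbled.
    static String loadFromStream() throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(utf8File()));
        return props.getProperty("greeting");
    }

    // Properties.load(Reader) uses the charset the Reader was created with.
    static String loadFromReader() throws IOException {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8File()), StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) throws IOException {
        System.out.println("stream intact: " + GREETING.equals(loadFromStream()));
        System.out.println("reader intact: " + GREETING.equals(loadFromReader()));
    }
}
```

Only the Reader-based load returns the original text; the stream-based one returns twice as many characters, one per UTF-8 byte.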
But the T5 documentation said somewhere -- and now I can't find that
either! -- that T5 supported storing the .properties files in the platform
encoding. So, assuming those things, my question was about how T5 enables that
feature: whether you can use the native encoding only if you are loading
messages with T5's Messages class, or whether T5 injects some
ResourceBundle.Control, or other feature, that switches the .properties file
encoding globally, so that any PropertyResourceBundle or Properties you load
would use the native encoding.
So specifically:
> This means that when a resource bundle is loaded, it needs to convert
> bytes in the file to characters. By default the JVM uses the platform
> encoding when loading files, but you can override this setting by
> specifying
>
> -Dfile.encoding=utf8
The file encoding does apply to default Readers and Writers, but specifically,
it does not apply to Properties and ResourceBundles, and so this would imply
that T5 does do something to enable this. But, maybe you weren't even talking
about .properties files all along? Maybe you meant other local files in
general and not specifically message bundles.
But then about the output, you said:
> The solution proposed in the wiki changes the output stream encoding that
> renders Java characters to the network.
The output to the client should have nothing to do with any of this, unless you
are saying that T5 switches the output encoding based on the local platform
encoding?
I'm a little lost.
I am also new to WebApp development: it may be a universal feature for WebApp
frameworks to allow local .properties files to be written in encodings other
than ISO 8859-1, but I'd find that dumb (no offense intended). It certainly has
nothing to do with the SE platform behavior, and you would run a strange risk
of incompatibilities that way. What happens if I specify a key with extended
Unicode characters in it? And I guess it can't be changed globally, since
that would break all the other conformant .properties files in the JDK and
other libraries. Maybe it's per-ClassLoader. Unless you just like getting
lucky!
That's partly why the XML properties format exists -- you can specify any
encoding you want, per-file if you like -- and the ResourceBundle.Control
facility... But anyway, as I said, I'm afraid I'm a little lost from the real
question, so this may not be of much use to you.
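For what it's worth, a small sketch of the XML properties round trip, using only java.util.Properties.storeToXML and loadFromXML. The class name, key and comment string are invented for the example, and the data stays in memory instead of going to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

class XmlPropertiesDemo {

    // A Russian greeting, written with escapes so this source stays pure ASCII.
    static final String GREETING = "\u041f\u0440\u0438\u0432\u0435\u0442";

    // Writes one property as an XML document in the requested encoding.
    // The encoding name ends up in the XML declaration of the output.
    static byte[] writeXml(String encoding) throws IOException {
        Properties props = new Properties();
        props.setProperty("greeting", GREETING);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, "demo messages", encoding);
        return out.toByteArray();
    }

    // Reads the document back; the parser takes the encoding from the
    // XML declaration, so the caller does not have to name it again.
    static String readXml(byte[] xml) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(new ByteArrayInputStream(xml));
        return props.getProperty("greeting");
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = writeXml("UTF-8");
        System.out.println("round trip intact: " + GREETING.equals(readXml(xml)));
    }
}
```

Because the encoding travels inside each file, different files can use different encodings without any JVM-wide setting.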
But in any case, I care less about the specifics than about the fact that you
do get any problems you have resolved!
Ciao!
-Steev Coco.
On Wed July 11 2007 1:45:39 pm you wrote:
> A basic feature of the JVM and Java (independent of Tapestry or any other
> framework), as we all know, is that every character is represented as two
> bytes (a 16-bit UTF-16 code unit). This means that once the byte-to-character
> conversion has happened, characters can no longer be truncated inside the JVM.
>
> The major points where data loss can happen are in IO, where data is
> transformed into a set of bytes. Such pain points are the filesystem,
> the network, etc.
>
> This means that when a resource bundle is loaded, it needs to convert
> bytes in the file to characters. By default the JVM uses the platform
> encoding when loading files, but you can override this setting by
> specifying
>
> -Dfile.encoding=utf8
>
> when running your JVM.
> The solution proposed in the wiki changes the output stream encoding that
> renders Java characters to the network.
>
> Hope this will help.
>
> Renat
Re: Russian symbols problems
Posted by Renat Zubairov <re...@gmail.com>.
A basic feature of the JVM and Java (independent of Tapestry or any other
framework), as we all know, is that every character is represented as two
bytes (a 16-bit UTF-16 code unit). This means that once the byte-to-character
conversion has happened, characters can no longer be truncated inside the JVM.
The major points where data loss can happen are in IO, where data is
transformed into a set of bytes. Such pain points are the filesystem,
the network, etc.
This means that when a resource bundle is loaded, it needs to convert
bytes in the file to characters. By default the JVM uses the platform
encoding when loading files, but you can override this setting by
specifying
-Dfile.encoding=utf8
when running your JVM.
The solution proposed in the wiki changes the output stream encoding that
renders Java characters to the network.
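To make the output side concrete, here is a rough sketch in plain Java, not Tapestry code: the same characters are written through an OutputStreamWriter with two different charsets. The class and method names are made up, and the byte array only stands in for a network stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class OutputEncodingDemo {

    // A Russian greeting, written with escapes so this source stays pure ASCII.
    static final String GREETING = "\u041f\u0440\u0438\u0432\u0435\u0442";

    // Turns characters into bytes the way a response writer would,
    // using the charset chosen for the output stream.
    static byte[] render(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = render(GREETING, StandardCharsets.UTF_8);
        byte[] latin1 = render(GREETING, StandardCharsets.ISO_8859_1);
        // UTF-8 needs two bytes per Cyrillic letter; ISO 8859-1 has no
        // Cyrillic at all, so the writer substitutes a question mark.
        System.out.println("UTF-8 byte count: " + utf8.length);
        System.out.println("ISO 8859-1 output: "
                + new String(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

The characters are intact inside the JVM in both cases; only the choice of charset at the output boundary decides whether the client can recover them.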
Hope this will help.
Renat
On 09/07/07, Steven Coco <co...@stevencoco.com> wrote:
> This is somewhat interesting to me.
>
> This is the expected, usual Java behavior (the .properties files must be in
> ISO 8859-1 encoding); but I thought I read somewhere that T5 allowed one to
> store the .properties files in UTF-8 encoding. Does this imply that is not
> so, or perhaps that this just has not yet been implemented?
>
> Can anyone say more about this feature? Would it apply only to instances of
> Messages, and not if I manually load a ResourceBundle? That seems likely.
> It's a pretty odd feature though, somewhat off the beaten Java path (and I am
> kind of skeptical of it since it seems to bring some amount of required
> administration with it).
>
> Also, if it would indeed rely on the server's platform encoding, then that
> might make the T5 WebApp out of spec, since it wouldn't be portable...
>
> Igor: good luck with this.
>
> While the topic is here, has anyone explored how T5 would handle an attempt to
> specify a custom ResourceBundle.Control object? Or is there a plan already in
> place? This is not currently critical at all for me, though I was an
> immediate fan of the XML properties files, and use them myself in desktop
> development.
>
> Ciao!
> -Steev Coco.
--
Best regards,
Renat Zubairov
Re: Russian symbols problems
Posted by Steven Coco <co...@stevencoco.com>.
This is somewhat interesting to me.
This is the expected, usual Java behavior (the .properties files must be in
ISO 8859-1 encoding); but I thought I read somewhere that T5 allowed one to
store the .properties files in UTF-8 encoding. Does this imply that is not
so, or perhaps that this just has not yet been implemented?
Can anyone say more about this feature? Would it apply only to instances of
Messages, and not if I manually load a ResourceBundle? That seems likely.
It's a pretty odd feature though, somewhat off the beaten Java path (and I am
kind of skeptical of it since it seems to bring some amount of required
administration with it).
Also, if it would indeed rely on the server's platform encoding, then that
might make the T5 WebApp out of spec, since it wouldn't be portable...
Igor: good luck with this.
While the topic is here, has anyone explored how T5 would handle an attempt to
specify a custom ResourceBundle.Control object? Or is there a plan already in
place? This is not currently critical at all for me, though I was an
immediate fan of the XML properties files, and use them myself in desktop
development.
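For reference, this is roughly what I mean by a custom ResourceBundle.Control, as a sketch only: it says nothing about how T5 itself loads messages. The class name Utf8Control is hypothetical, it needs Java 7 or later as written, and it ignores the reload flag for brevity.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class Utf8Control extends ResourceBundle.Control {

    // Only .properties bundles are handled; class-based bundles are skipped.
    @Override
    public List<String> getFormats(String baseName) {
        return FORMAT_PROPERTIES;
    }

    // Loads the bundle for one candidate locale, decoding the file as UTF-8
    // instead of relying on the default decoding of the stream-based loader.
    @Override
    public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                    ClassLoader loader, boolean reload)
            throws IOException {
        String bundleName = toBundleName(baseName, locale);
        String resourceName = toResourceName(bundleName, "properties");
        InputStream stream = loader.getResourceAsStream(resourceName);
        if (stream == null) {
            // No file for this locale: let ResourceBundle try the next candidate.
            return null;
        }
        try (Reader reader = new InputStreamReader(stream, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        }
    }
}
```

It would be passed as the last argument to ResourceBundle.getBundle(baseName, locale, loader, control), so it only affects bundles loaded through that call and leaves every other .properties file alone.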
Ciao!
-Steev Coco.
On Sat July 7 2007 10:11:18 am "Igor Drobiazko" <ig...@gmail.com>
wrote:
> Hi,
>
> you have 2 possibilities: either convert your property files to ASCII with
> Unicode escapes using native2ascii, or start your server in UTF-8 mode.
>
> http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/native2ascii.html
Re: Russian symbols problems
Posted by Igor Drobiazko <ig...@gmail.com>.
Hi,
you have 2 possibilities: either convert your property files to ASCII with
Unicode escapes using native2ascii, or start your server in UTF-8 mode.
http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/native2ascii.html
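As a rough illustration of the first option, the sketch below does in Java what the conversion amounts to: every non-ASCII character becomes a \uXXXX escape, and java.util.Properties decodes those escapes when loading. It is not the native2ascii tool itself; the class and method names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class AsciiEscapeDemo {

    // A Russian greeting, written with escapes so this source stays pure ASCII.
    static final String GREETING = "\u041f\u0440\u0438\u0432\u0435\u0442";

    // Replaces each non-ASCII char with a backslash, the letter u and the
    // four hex digits of its UTF-16 code unit; ASCII passes through unchanged.
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Loads the escaped content the classic way, as an ISO 8859-1 byte stream.
    static String loadGreeting(String fileContent) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(
                fileContent.getBytes(StandardCharsets.ISO_8859_1)));
        return props.getProperty("greeting");
    }

    public static void main(String[] args) throws IOException {
        String escaped = toAsciiEscapes(GREETING);
        System.out.println("escaped value: " + escaped);
        System.out.println("loaded intact: "
                + GREETING.equals(loadGreeting("greeting=" + escaped + "\n")));
    }
}
```

Since the escaped file contains only ASCII, it loads the same way regardless of the platform encoding.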
On 7/7/07, Foror <fo...@mail.ru> wrote:
>
> I'm using message properties files in UTF-8 with Russian characters. This
> works in T4, but in T5 it does not: the Russian characters come out corrupted.
>
> There is also a problem with @ApplicationState: after saving the state and
> going back to the form, the Russian text is corrupted as well.