Posted to users@tapestry.apache.org by Foror <fo...@mail.ru> on 2007/07/07 11:37:31 UTC

Russian symbols problems

I'm using message properties files in UTF-8 with Russian characters. This
works in T4, but in T5 it does not: the Russian characters are garbled.

There is also a problem with @ApplicationState: after saving the state and
going back to the form, the Russian text is garbled as well.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tapestry.apache.org
For additional commands, e-mail: users-help@tapestry.apache.org


Re: Russian symbols problems

Posted by Ulrich Stärk <ul...@spielviel.de>.
Have you tried this:

http://wiki.apache.org/tapestry/Tapestry5Utf8Encoding

This will ensure that pages served by Tapestry are UTF-8 encoded. For me
this was enough to make T5 serve German UTF-8 characters correctly.

Uli



Re: Russian symbols problems

Posted by Steven Coco <co...@stevencoco.com>.
Hi.

I may have missed the original question a little. I should probably have a 
look at the source at this point.

First, is this about Tapestry 5 or an earlier version? I was assuming this was 
about T5: if not I don't think anything I've said applies!

I do understand Java's character handling. For some reason that I don't see
in the message now, I thought you were specifically talking about .properties
files. What I brought up is that, if you are using either ResourceBundle or
Properties directly, then Java does in fact pin those files at ISO 8859-1,
regardless of the platform encoding: see the documentation for
java.util.Properties. And note that this applies to reading local files only.
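
A minimal sketch of the behavior described above, assuming Java 6 or later
(where Properties.load(Reader) exists). The class name, the key "greeting" and
the in-memory "file" are made up for illustration; the point is only that the
InputStream overload decodes as ISO 8859-1, while the Reader overload decodes
with whatever charset the Reader was given.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingDemo {

    // A one-line .properties "file" holding a six-letter Russian word,
    // stored as UTF-8 bytes (hypothetical content for this sketch).
    static final byte[] UTF8_FILE =
            "greeting=\u041f\u0440\u0438\u0432\u0435\u0442\n".getBytes(StandardCharsets.UTF_8);

    // Properties.load(InputStream) decodes the bytes as ISO 8859-1,
    // so each UTF-8 byte becomes one (wrong) character.
    static String loadAsStream() {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(UTF8_FILE));
            return props.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Properties.load(Reader) leaves decoding to the Reader, here UTF-8.
    static String loadAsUtf8Reader() {
        try {
            Properties props = new Properties();
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(UTF8_FILE), StandardCharsets.UTF_8));
            return props.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAsStream().length());     // 12: one char per UTF-8 byte
        System.out.println(loadAsUtf8Reader().length()); // 6: the original word
    }
}
```

Whether T5's Messages uses either of these calls internally is not something
this sketch claims; it only shows the plain java.util.Properties behavior.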

But the T5 documentation said somewhere -- and now I can't find that
either! -- that T5 supported storing the .properties files in the platform
encoding. Assuming that, my question was about how T5 enables the feature:
whether you can use the native encoding only if you load messages with T5's
Messages class, or whether T5 injects some ResourceBundle.Control, or other
mechanism, that switches the .properties file encoding globally, so that any
PropertyResourceBundle or Properties you load would use the native encoding.

So specifically:

> This means that when the resource bundle is loaded it needs to convert
> bytes in the file to the characters. By default JVM is using platform
> encoding when loading files, but you can override this setting by
> specifying
>
> -Dfile.encoding=utf8

The file encoding does apply to default Readers and Writers, but specifically, 
it does not apply to Properties and ResourceBundles, and so this would imply 
that T5 does do something to enable this. But, maybe you weren't even talking 
about .properties files all along? Maybe you meant other local files in 
general and not specifically message bundles.

But then about the output, you said:

> Solution proposed in Wiki changes the output stream encoding that
> renders java characters to the network.

The output to the client should have nothing to do with any of this, unless
you are saying that T5 switches the output encoding based on the local
platform encoding?

I'm a little lost.

I am also new to WebApp development: it may be a universal feature for WebApp 
frameworks to allow local .properties files to be written in encodings other 
than 8859-1, but I'd find that dumb (no offense intended). It certainly has 
nothing to do with the SE platform behavior. But you would run a strange risk 
of incompatibilities that way. What happens if I specify a key with extended 
Unicode characters in it? And I guess it can't be changed globally, since 
that would break all the other conformant .properties files in the JDK and 
other libraries. Maybe it's per-ClassLoader. Unless you just like getting 
lucky!

That's partly why the XML properties format exists -- you can specify any 
encoding you want, per-file if you like -- and the ResourceBundle.Control 
facility... But anyway, as I said, I'm afraid I'm a little lost from the real 
question, so this may not be of much use to you.
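
For reference, a small sketch of the XML properties format mentioned above,
using Properties.storeToXML and Properties.loadFromXML from the JDK. The class
and method names and the key "greeting" are invented for the example; the
encoding is passed explicitly on write and is read back from the XML
declaration on load.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

final class XmlPropertiesDemo {

    // Writes the properties as XML; the chosen encoding is recorded in the
    // XML declaration, so it travels with the file.
    static byte[] store(Properties props, String encoding) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            props.storeToXML(out, "demo messages", encoding);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the XML back; no charset argument is needed because the parser
    // takes it from the document itself.
    static Properties load(byte[] xml) {
        try {
            Properties props = new Properties();
            props.loadFromXML(new ByteArrayInputStream(xml));
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stores one value and loads it again (illustrative helper).
    static String roundTrip(String value, String encoding) {
        Properties props = new Properties();
        props.setProperty("greeting", value);
        return load(store(props, encoding)).getProperty("greeting");
    }

    public static void main(String[] args) {
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";
        System.out.println(roundTrip(word, "UTF-8").equals(word));
    }
}
```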

But in any case, I care less about the specifics than about the fact that you
do get any problems you have resolved!

Ciao!
-Steev Coco.




Re: Russian symbols problems

Posted by Renat Zubairov <re...@gmail.com>.
A basic feature of the JVM and Java (independent of Tapestry or any other
framework) is that, as we all know, every character is represented as a
two-byte (16-bit) value. This means that once the byte-to-character
conversion has happened, characters can no longer be corrupted inside the
JVM.

The main points where data loss can happen are at I/O, where data is
transformed into a sequence of bytes: the filesystem, the network, etc.

This means that when a resource bundle is loaded, the bytes in the file must
be converted to characters. By default the JVM uses the platform encoding
when loading files, but you can override this setting by specifying

-Dfile.encoding=utf8

when running your JVM.
The solution proposed in the wiki changes the encoding of the output stream
that renders Java characters to the network.
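
To illustrate the point about I/O boundaries, here is a small sketch (class
and method names are made up) that encodes a string to bytes with one charset
and decodes it with another. Inside the JVM the string is intact; the damage
appears only when the two sides of the byte boundary disagree, or when the
target charset cannot represent the characters at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetBoundaryDemo {

    // Simulates writing text with one charset and reading it with another.
    static String roundTrip(String text, Charset writeWith, Charset readWith) {
        byte[] bytes = text.getBytes(writeWith);
        return new String(bytes, readWith);
    }

    public static void main(String[] args) {
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // Same charset on both sides: nothing is lost.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8,
                StandardCharsets.UTF_8).equals(word));          // true

        // Written as UTF-8, read as ISO 8859-1: garbled text.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8,
                StandardCharsets.ISO_8859_1).equals(word));     // false

        // ISO 8859-1 has no Cyrillic, so each letter is replaced by '?'.
        System.out.println(roundTrip(word, StandardCharsets.ISO_8859_1,
                StandardCharsets.ISO_8859_1));                  // ??????

        // What "platform encoding" means on this JVM; the value depends on
        // the JVM version, the OS and settings such as -Dfile.encoding.
        System.out.println("default charset: " + Charset.defaultCharset());
    }
}
```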

Hope this will help.

Renat


-- 
Best regards,
Renat Zubairov



Re: Russian symbols problems

Posted by Steven Coco <co...@stevencoco.com>.
This is somewhat interesting to me.

This is the expected, usual Java behavior (the .properties files must be in
ISO 8859-1 encoding); but I thought I read somewhere that T5 allowed one to
store the .properties files in UTF-8 encoding. Does this imply that is not
so, or perhaps this just has not yet been implemented?

Can anyone say more about this feature? Would it apply only to instances of
Messages, and not if I manually load a ResourceBundle? That seems likely.
It's a pretty odd feature though, somewhat off the beaten Java path (and I am
kind of skeptical of it since it seems to bring some amount of required
administration with it).

Also, if it would indeed rely on the server's platform encoding, then that 
might make the T5 WebApp out of spec, since it wouldn't be portable...

Igor: good luck with this.

While the topic is here, has anyone explored how T5 would handle an attempt to
specify a custom ResourceBundle.Control object? Or is there a plan already in
place? This is not currently critical at all for me, though I was an
immediate fan of the XML properties files, and use them myself in desktop
development.
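
As a rough sketch of what a custom ResourceBundle.Control can look like in
plain Java SE (this says nothing about how, or whether, T5 would pick one up):
the Control below reads .properties bundles as UTF-8. The names Utf8Control,
Utf8ControlDemo and "messages" are hypothetical, the reload flag is ignored
for brevity, and the demo writes a throwaway bundle into a temporary directory
so that it is self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class Utf8ControlDemo {

    // Hypothetical Control that decodes .properties bundles as UTF-8.
    static final class Utf8Control extends ResourceBundle.Control {
        @Override
        public List<String> getFormats(String baseName) {
            return FORMAT_PROPERTIES; // only look for .properties files
        }

        @Override
        public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                        ClassLoader loader, boolean reload)
                throws IOException {
            String bundleName = toBundleName(baseName, locale);
            String resourceName = toResourceName(bundleName, "properties");
            InputStream in = loader.getResourceAsStream(resourceName);
            if (in == null) {
                return null; // no bundle for this candidate locale
            }
            try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
                return new PropertyResourceBundle(reader);
            }
        }
    }

    // Writes a UTF-8 bundle to a temp directory and looks up one key.
    // The temp directory is not cleaned up in this sketch.
    static String lookup() {
        try {
            Path dir = Files.createTempDirectory("bundles");
            Files.write(dir.resolve("messages.properties"),
                    "greeting=\u041f\u0440\u0438\u0432\u0435\u0442\n"
                            .getBytes(StandardCharsets.UTF_8));
            try (URLClassLoader loader =
                         new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                ResourceBundle bundle = ResourceBundle.getBundle(
                        "messages", Locale.ROOT, loader, new Utf8Control());
                return bundle.getString("greeting");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup().length());
    }
}
```

Because the Control is passed per getBundle call, it affects only the bundles
loaded through it, not other .properties files in the JDK or in libraries.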

Ciao!
-Steev Coco.




Re: Russian symbols problems

Posted by Igor Drobiazko <ig...@gmail.com>.
Hi,

you have two options: either convert your properties files with native2ascii,
which turns native-encoded characters into \uXXXX escapes, or start your
server in UTF-8 mode.

http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/native2ascii.html
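
For illustration only, the kind of conversion native2ascii performs can be
sketched in a few lines of Java: every character outside 7-bit ASCII is
replaced by a Unicode escape that java.util.Properties understands. The class
and method names are made up, and this is not the tool's actual
implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class UnicodeEscapeDemo {

    // Replaces each non-ASCII UTF-16 code unit with a backslash, the letter u
    // and four hex digits, leaving ASCII characters untouched.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append("\\u").append(String.format("%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Escapes a value, stores it as an ISO 8859-1 properties line and loads
    // it back, showing that the escapes survive the default decoding.
    static String roundTrip(String value) {
        try {
            String line = "greeting=" + escapeNonAscii(value) + "\n";
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(
                    line.getBytes(StandardCharsets.ISO_8859_1)));
            return props.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";
        System.out.println(escapeNonAscii(word));
        System.out.println(roundTrip(word).equals(word));
    }
}
```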
