Posted to dev@commons.apache.org by Steve Cohen <sc...@javactivity.org> on 2005/04/07 03:09:42 UTC
Is there an Apache or java standard for expressing non-English String literals
Neeme Praks wrote:
> Also, I noticed that your java sources are in some strange encoding. If
> I open those tests that use french letters in my Eclipse and save them
> then they become corrupt and will fail.
> My configuration assumes that all source files are in UTF8 and I think
> that should be the most reasonable assumption, no?
The files in question here are
org.apache.commons.net.ftp.parser.FTPTimestampParserImplTest.java
and
org.apache.commons.net.ftp.FTPClientConfig.java
in the jakarta-commons-net project.
Mr. Praks is correctly pointing out that my test code (and other source
code) depends sometimes on typing string literals in languages other
than English. What is the CORRECT way to handle this in source code,
and what can I do to make editors such as Eclipse handle it correctly?
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
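The corruption Neeme describes can be reproduced in a few lines. This is only a sketch of the mechanism, not the code from the files above: the class name EncodingMismatchDemo, the helper reopen and the sample word "février" are made up for illustration. It saves a string in one charset and reads the bytes back in another, which is roughly what an editor does when its configured encoding differs from the file's.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of commons-net.
class EncodingMismatchDemo {

    // Encode text with one charset and decode the bytes with another,
    // as an editor does when its setting does not match the file.
    static String reopen(String text, Charset savedAs, Charset openedAs) {
        byte[] onDisk = text.getBytes(savedAs);
        return new String(onDisk, openedAs);
    }

    public static void main(String[] args) {
        // Sample word with one non-ASCII letter (e-acute, U+00E9).
        String original = "f\u00e9vrier";

        // Saved as UTF-8, opened as ISO-8859-1: the two bytes that encode
        // e-acute are shown as two separate characters.
        String asLatin1 = reopen(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.length()); // 8 instead of 7

        // Saved as ISO-8859-1, opened as UTF-8: the single byte 0xE9 is not
        // valid UTF-8 here, so the decoder substitutes U+FFFD.
        String asUtf8 = reopen(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(asUtf8.indexOf('\uFFFD') >= 0); // true
    }
}
```

Once the file is saved again after such a misread, the wrong characters are written to disk and the original literal is lost, which would explain tests failing after a save.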
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
> Mr. Praks is correctly pointing out that my test code (and other
> source code) depends sometimes on typing string literals in languages
> other than English. What is the CORRECT way to handle this in source
> code, and what can I do to make editors such as Eclipse handle it
> correctly?
Why not move to UTF-8?
Otherwise you might have to use "Unicode escapes". Take a look at:
http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html
---
Mario
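For readers who have not used them, a Unicode escape in Java source looks like the sketch below. The class name UnicodeEscapeDemo and the sample word are invented for illustration and are not taken from the commons-net sources. The point is that the file contains only ASCII bytes, so it reads the same under any ASCII-compatible editor encoding.

```java
// Hypothetical demo class; not part of commons-net.
class UnicodeEscapeDemo {

    // French for "February", with e-acute written as an escape so the
    // source file itself stays pure ASCII.
    static final String FEVRIER = "f\u00e9vrier";

    public static void main(String[] args) {
        // The compiler translates the escape before it tokenizes the source,
        // so the literal holds a single char for e-acute.
        System.out.println(FEVRIER.length());        // 7
        System.out.println((int) FEVRIER.charAt(1)); // 233, i.e. 0xE9
    }
}
```

One caveat: the translation happens everywhere in the file, including inside comments, so a malformed escape in a comment is a compile error too.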
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
> Would others agree with this? Is the best editor setting for editing
> code where i18n could be an issue to set the editor to UTF-8 or is it
> better to leave it at its default local setting? What are the pros
> and cons here? Had I set the editor for UTF-8 would I not have had
> these issues? Or is it best to consciously code with explicit unicode
> escapes to avoid these issues on ANYONE's editor? Or both?
I think it is best to encode the source in UTF-8.
I am not sure about Unicode escapes; they may be the best way to make sure
that no misconfigured IDE can destroy sensitive data.
For javadoc I don't want to use Unicode escapes; they make the source hard
to read when you browse through it.
What also needs to be addressed is the target encoding. If I recall
correctly you can set the source encoding and the target(?) encoding
of a source file, e.g. if you compile your source for target UTF-8 then
you might have i18n issues again.
But I think we could ignore this problem and use UTF-8 for the target
encoding. There might only be a problem if we output UTF-8 encoded
string literals hardcoded into the source, which is not very common.
However, if it happens and it is not acceptable for a user, it is easy to
recompile a library with the desired encoding.
---
Mario
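The source-encoding half of this can be sketched with javac's -encoding option, here driven through the standard javax.tools API so the example stays in Java. Everything named below (SourceEncodingDemo, compileAndRead, the generated Month class) is hypothetical, and the sketch assumes it runs on a JDK rather than a bare JRE. It covers only the source encoding, not the target side discussed above. It writes a small source file containing a raw e-acute in the given charset, compiles it with a matching -encoding, and reads the constant back.

```java
import java.lang.reflect.Field;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class; not part of commons-net.
class SourceEncodingDemo {

    // Saves a generated source file in the given charset, compiles it with
    // a matching -encoding option, and returns the compiled string constant.
    static String compileAndRead(Charset charset) {
        try {
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("No system compiler; run on a JDK");
            }
            Path dir = Files.createTempDirectory("encoding-demo");
            Path src = dir.resolve("Month.java");
            // The generated file contains a raw e-acute, not an escape.
            String code = "class Month { static final String NAME = \"f\u00e9vrier\"; }\n";
            Files.write(src, code.getBytes(charset));

            int exit = javac.run(null, null, null,
                    "-encoding", charset.name(), "-d", dir.toString(), src.toString());
            if (exit != 0) {
                throw new IllegalStateException("javac failed with exit code " + exit);
            }
            // Load the freshly compiled class and read its constant.
            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Field name = loader.loadClass("Month").getDeclaredField("NAME");
                name.setAccessible(true);
                return (String) name.get(null);
            }
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Both files differ on disk, but each compiles to the same string
        // because javac was told which encoding to expect.
        String fromLatin1 = compileAndRead(StandardCharsets.ISO_8859_1);
        String fromUtf8 = compileAndRead(StandardCharsets.UTF_8);
        System.out.println(fromLatin1.equals(fromUtf8)); // true
    }
}
```

In a real build the same thing is usually done by passing the encoding to the compiler task of the build tool instead of relying on the platform default.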
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Steve Cohen <sc...@javactivity.org>.
Would others agree with this? Is the best editor setting for editing
code where i18n could be an issue to set the editor to UTF-8 or is it
better to leave it at its default local setting? What are the pros and
cons here? Had I set the editor for UTF-8 would I not have had these
issues? Or is it best to consciously code with explicit unicode escapes
to avoid these issues on ANYONE's editor? Or both?
Neeme Praks wrote:
>
> Yes, now it should be ok.
> However, I would advise to change it anyway - all platform specific
> settings are Bad(tm).
> :-)
>
> Steve Cohen wrote:
>
>> Okay, found it. I assume, though, that it is not necessary to change
>> this, now that I have replaced all the non-ASCII chars with unicode
>> equivalents. Or am I still missing something?
>>
>>
>> Neeme Praks wrote:
>>
>>> Window -> Preferences -> Workbench -> Editors
>>>
>>> and there is "Text file encoding", can be (platform) default or custom.
>>>
>>> Rgds,
>>> Neeme
>>>
>>> Steve Cohen wrote:
>>>
>>>> However, when you say "that depends on your file encoding", where is
>>>> THAT defined, actually? I looked through all the Eclipse options
>>>> and found nothing indicating option to change encodings.
>>>> Presumably, other editors I might use might have some other place to
>>>> define this.
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Neeme Praks <ne...@apache.org>.
Yes, now it should be ok.
However, I would advise changing it anyway - all platform-specific
settings are Bad(tm).
:-)
Steve Cohen wrote:
> Okay, found it. I assume, though, that it is not necessary to change
> this, now that I have replaced all the non-ASCII chars with unicode
> equivalents. Or am I still missing something?
>
>
> Neeme Praks wrote:
>
>> Window -> Preferences -> Workbench -> Editors
>>
>> and there is "Text file encoding", can be (platform) default or custom.
>>
>> Rgds,
>> Neeme
>>
>> Steve Cohen wrote:
>>
>>> However, when you say "that depends on your file encoding", where is
>>> THAT defined, actually? I looked through all the Eclipse options
>>> and found nothing indicating option to change encodings.
>>> Presumably, other editors I might use might have some other place to
>>> define this.
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Steve Cohen <sc...@javactivity.org>.
Okay, found it. I assume, though, that it is not necessary to change
this, now that I have replaced all the non-ASCII chars with unicode
equivalents. Or am I still missing something?
Neeme Praks wrote:
> Window -> Preferences -> Workbench -> Editors
>
> and there is "Text file encoding", can be (platform) default or custom.
>
> Rgds,
> Neeme
>
> Steve Cohen wrote:
>
>> However, when you say "that depends on your file encoding", where is
>> THAT defined, actually? I looked through all the Eclipse options and
>> found nothing indicating option to change encodings. Presumably,
>> other editors I might use might have some other place to define this.
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Neeme Praks <ne...@apache.org>.
Window -> Preferences -> Workbench -> Editors
and there is "Text file encoding", which can be the (platform) default or custom.
Rgds,
Neeme
Steve Cohen wrote:
> However, when you say "that depends on your file encoding", where is
> THAT defined, actually? I looked through all the Eclipse options and
> found nothing indicating option to change encodings. Presumably,
> other editors I might use might have some other place to define this.
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by Steve Cohen <sc...@javactivity.org>.
robert burrell donkin wrote:
> On 7 Apr 2005, at 02:09, Steve Cohen wrote:
>
>> Neeme Praks wrote:
>>
>>> Also, I noticed that your java sources are in some strange encoding.
>>> If I open those tests that use french letters in my Eclipse and save
>>> them then they become corrupt and will fail.
>>> My configuration assumes that all source files are in UTF8 and I
>>> think that should be the most reasonable assumption, no?
>>
>>
>> The files in question here are
>> org.apache.commons.net.ftp.parser.FTPTimestampParserImplTest.java
>> and
>> org.apache.commons.net.ftp.FTPClientConfig.java
>> in the jakarta-commons-net project.
>>
>> Mr. Praks is correctly pointing out that my test code (and other
>> source code) depends sometimes on typing string literals in languages
>> other than English. What is the CORRECT way to handle this in source
>> code, and what can I do to make editors such as Eclipse handle it
>> correctly?
>
>
> that depends on your file encoding :)
>
> if you use UTF-8 (which is typical) it's safest to use unicode escaping
> when dealing with any non-ascii characters.
>
> - robert
>
>
That's what I have done to fix this. I converted all the non-ASCII
chars (and also the HTML-escaped non-ASCIIs in the javadoc comments) to
Unicode escapes. Javadoc apparently converts them back to HTML-escaped
chars when it creates the HTML.
However, when you say "that depends on your file encoding", where is
THAT defined, actually? I looked through all the Eclipse options and
found nothing indicating an option to change encodings. Presumably, other
editors I might use have some other place to define this.
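A conversion like this can be done by hand or with a small helper. The following is a minimal sketch of such a helper, not the method that was actually used on the commons-net files; the class name AsciiEscaper and the method escapeNonAscii are made up for illustration. It replaces every char above 0x7F with the corresponding Unicode escape and leaves ASCII untouched.

```java
// Hypothetical helper class; not part of commons-net.
class AsciiEscaper {

    // Returns the text with every non-ASCII char replaced by a six-character
    // Unicode escape. Working on chars (UTF-16 units) means supplementary
    // characters come out as two escapes, which is what Java source expects.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the ASCII-only spelling that can be pasted into a literal.
        System.out.println(escapeNonAscii("f\u00e9vrier"));
    }
}
```

For the sample word this prints f\u00e9vrier, twelve ASCII characters that compile back to the original seven-character string.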
Re: Is there an Apache or java standard for expressing non-English String literals
Posted by robert burrell donkin <rd...@apache.org>.
On 7 Apr 2005, at 02:09, Steve Cohen wrote:
> Neeme Praks wrote:
>
>> Also, I noticed that your java sources are in some strange encoding.
>> If I open those tests that use french letters in my Eclipse and save
>> them then they become corrupt and will fail.
>> My configuration assumes that all source files are in UTF8 and I
>> think that should be the most reasonable assumption, no?
>
> The files in question here are
> org.apache.commons.net.ftp.parser.FTPTimestampParserImplTest.java
> and
> org.apache.commons.net.ftp.FTPClientConfig.java
> in the jakarta-commons-net project.
>
> Mr. Praks is correctly pointing out that my test code (and other
> source code) depends sometimes on typing string literals in languages
> other than English. What is the CORRECT way to handle this in source
> code, and what can I do to make editors such as Eclipse handle it
> correctly?
that depends on your file encoding :)
if you use UTF-8 (which is typical) it's safest to use unicode escaping
when dealing with any non-ascii characters.
- robert