Posted to user@commons.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2020/04/10 16:43:23 UTC

[OT?] Managing text-direction in Java

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

All,

This is likely off-topic because this isn't really about any
particular commons-* project, but I know there are Java-text wonks who
lurk here, so I figured this was a good place to ask. I'm happy to
relocate this question elsewhere if there's a better forum.

I'd like to know what the best practices are for managing
text-direction in Java.

It will help to give some background about my project's efforts thus
far.

We have two kinds of text in our web-based project:

1. Standard UI text, with replaceable parameters and such,
   in properties files. This text is generated by programmers, etc.

2. "Content" text, stored in an array of database tables. This text
   is entered by trained users through an administrative interface.
   There are some 3 dozen tables like this in our database.

We are using Struts 1.x, which has great support not only for resource
bundles, but also for resolving which resource bundle to use when the
best one isn't found. For example, I can ask for Portuguese and, not
finding it, actually discover that I've gotten English instead. That's
not something that, e.g., the Servlet API provides.

So category 1 above is handled, indirectly, through resource bundles,
and I can see which java.util.Locale is represented by each.

Category 2 is handled by having a pair of fields in each database
table (storing text records): language and country. Each is a CHAR(2)
with the country being nullable. All of the code to move the text
to/from the database was written by our team using java.util.Locale as
a basis (ignoring the locale "variant").

[NB: I use separate language and country fields because we need to
search for records separately by each of those.]

When looking at how to properly support right-to-left text such as
Arabic, Farsi, Urdu, Hebrew, etc., I discovered that using CSS this is
very easy from a UI perspective and I've developed some simple CSS
classes which effectively display the proper text-direction AND layout
which are appropriate... if I can determine the proper text-direction.

My first implementation just had a list of languages ([ar, fa, iw,
...]) that were always right-to-left. Looking up Locale.getLanguage()
in my RTL-language-list works great, and I can set a CSS class in my
HTML <body> element to get the page-layout and text-direction working
as desired.
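
A minimal sketch of that first, language-based approach. The class
name, method name, and the exact contents of the list are hypothetical.
Both "iw" and "he" are listed because, as far as I know, older JDKs
report Hebrew as "iw" from Locale.getLanguage() while JDK 17 and later
report "he" by default.

```java
import java.util.Locale;
import java.util.Set;

final class RtlLanguages {
    // Hypothetical, incomplete list of languages treated as right-to-left.
    private static final Set<String> RTL_LANGUAGES =
        Set.of("ar", "fa", "ur", "he", "iw");

    // Returns the value to use for a CSS class or dir attribute.
    public static String cssDirection(Locale locale) {
        return RTL_LANGUAGES.contains(locale.getLanguage()) ? "rtl" : "ltr";
    }

    public static void main(String[] args) {
        System.out.println(cssDirection(Locale.forLanguageTag("ar"))); // rtl
        System.out.println(cssDirection(Locale.ENGLISH));              // ltr
    }
}
```

The returned string could then be written into the <body> element's
class attribute by the page template.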

Then I discovered that the above is all wrong: text-directionality
isn't determined by the /language/ but by the /script/ the language is
written in, and a single language can be written in scripts that
disagree about text-direction. So my list of languages doesn't work in
all cases. I need something else.

It seems that in the case of Category 1 above, this is simple: I can
simply create a bundle key called text-direction and let the bundle
decide. This has the advantage of the person writing the localized
bundle getting to make the decision as to what the text-direction
ought to be. I think I can call that problem "solved" at this point.
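
A sketch of how the bundle-key idea might look in code. The key name
"text-direction" comes from the paragraph above; the accepted values
("rtl" / "ltr"), the left-to-right default when the key is missing,
and all class and method names are assumptions. The in-memory
ListResourceBundle only stands in for a localized properties file.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

final class BundleDirection {
    static final String KEY = "text-direction";

    // Reads the direction from the bundle; falls back to "ltr" if the
    // key is absent or holds anything other than "rtl".
    public static String directionOf(ResourceBundle bundle) {
        if (!bundle.containsKey(KEY)) {
            return "ltr";
        }
        String value = bundle.getString(KEY).trim();
        return "rtl".equalsIgnoreCase(value) ? "rtl" : "ltr";
    }

    // Test helper: a bundle holding only the direction key (or nothing).
    public static ResourceBundle bundleWith(String direction) {
        return new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return direction == null
                    ? new Object[0][]
                    : new Object[][] {{KEY, direction}};
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(directionOf(bundleWith("rtl"))); // rtl
        System.out.println(directionOf(bundleWith(null)));  // ltr
    }
}
```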

With Category 2, our standard operating procedure is to have a call
like this:

public FooClass.Text getFooText(Locale locale);

FooClass.Text looks like:

public Locale getLocale();
public String getText();

This allows us to do:

FooClass.Text text = getFooText(Locale.UK);
System.out.println(text.getLocale());

And have the output be "en_US" if there is text available for en_US
but not for en_GB (the locale that Locale.UK represents).

But how to handle the directionality?

Looking at java.util.Locale for (hopefully) some direction confuses
me. First of all, Locale historically had 3 items: language, country,
variant. Language and country I already have. As I mentioned
previously, we've been ignoring variant. Maybe that will get me what I
need?

Maybe not.

The javadoc for Locale basically says that variant can be anything.
The same javadoc mentions "script" but then says this about the 1-,
2-, and 3-argument constructors:

"
These constructors allow you to create a Locale object with language,
country and variant, but you cannot specify script or extensions.
"

Okay, so ... what do I do about script?

Maybe I've been failing to keep updated when it comes to the features
of java.util.Locale over the years. In the Java 1.7 (and later)
javadoc, I discover that java.util.Locale.Builder exists and it CAN
specify a script. Also, Locale has a "script" member, now, so I can:

// Locale.Builder must be instantiated, and it calls the country "region"
Locale loc = new Locale.Builder()
    .setLanguage(rs.getString("language"))
    .setRegion(rs.getString("country"))
    .setScript(rs.getString("script"))
    .build();

String script = loc.getScript();

Now I can grab the R-to-L scripts from the Wikipedia page for ISO
15924 and say:

if(rtlScripts.contains(loc.getScript())) {
  // do the right-to-left stuff
}
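
Put together as a self-contained sketch, assuming the three values
arrive as plain strings (null where the column is NULL). The class and
method names are hypothetical, and the script list is a deliberately
partial selection of right-to-left ISO 15924 codes, not the full set.

```java
import java.util.Locale;
import java.util.Set;

final class ScriptDirection {
    // Partial list of right-to-left ISO 15924 script codes (assumption:
    // extend it from the ISO 15924 registry as needed).
    private static final Set<String> RTL_SCRIPTS =
        Set.of("Arab", "Hebr", "Syrc", "Thaa", "Nkoo", "Adlm");

    // Builds a Locale from database-style fields; null fields are
    // treated by Locale.Builder as "not set".
    public static Locale toLocale(String language, String country, String script) {
        return new Locale.Builder()
            .setLanguage(language)
            .setRegion(country)
            .setScript(script)
            .build();
    }

    // True only when the locale carries an explicit right-to-left script.
    public static boolean isRightToLeft(Locale locale) {
        return RTL_SCRIPTS.contains(locale.getScript());
    }

    public static void main(String[] args) {
        // Uzbek is written in more than one script.
        System.out.println(isRightToLeft(toLocale("uz", "AF", "Arab"))); // true
        System.out.println(isRightToLeft(toLocale("uz", "UZ", "Latn"))); // false
        // No script stored: getScript() returns "", so this is false.
        System.out.println(isRightToLeft(toLocale("ar", null, null)));   // false
    }
}
```

Note that a locale with no script set always comes out as
left-to-right here, so this check is only as good as the script data
stored with each record.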

This means that I should be able to accomplish what I'm looking to do
by:

1. Adding a new nullable "script" column to all text-tables in my
database.

2. Updating my code that reads / writes those tables to include the
"script" in its work.

3. Consulting the locale's "script" attribute, combined with my list of
R-to-L scripts, when determining text-directionality.

Does all that sound about right?

Thanks in advance,
- -chris

-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl6QoisACgkQHPApP6U8
pFgVnw/+JaTQZLr8aJPTXPFNQzfrnBZ7uaa/FKIezVCu2dMZrFEk/Ea00LUl+yCP
2pQdPgFQqnaIC+pC9/GEdxgsyXZW3liQDl6orKzzLUEF52wRbJL7eA4PEEmpa5bd
eqwi5Kv6rTlIm8/J2unNtdA9NQA87Yk2UskVLOmfO3dvs/0UbTe8YBLsqLt9Y01Z
qprn9109dHlHA/Dyx01ocrJpXtRyoMiqZNxuq0UVWWtTeSPYLHgnDCoTzs9HLqxB
SDqtld4xH1LZggUBMphXzRYL7YMhv7kBO5Z9SBpX9e+KYpZga8dSkx82WHx6Itv2
nBkNRsOHNFM+TDqgmBU640VaVm/z9L0Bb1PDe+a0TVQEo/S6T8isNbaDH/yuEgxm
dL4LtIzVVCVPWBOq6JVW3S0Rdcm1jRjlLJzoCmJpiKA6zWa5O1IIHlFFXrkiVPgi
Kz0oIl9oPcOyxVrIMNJMBt0qd0wfNvPqkLWyeym1tlcajpM3cpPFG4mDpMo8rzbE
+4NQi5UyJdEIdg5MINO6v11yjmY1nw/3HgdMlgdMb33MqDN9Yn7GBsNk5WfmNax0
Fm1vRnps1S5CpJygnrVE7/we0gytEQXoINCPpUJJyZi5YGD1dKUk3QvQ6+iEsasi
cu5izA28BwLaxgV7B941qYYH3u642fuqYgK8w/O8OhSChAcQ/3c=
=Yg7z
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: [OT?] Managing text-direction in Java

Posted by Rob Tompkins <ch...@gmail.com>.

> On Apr 20, 2020, at 12:40 PM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Rob,
> 
> On 4/10/20 13:26, Rob Tompkins wrote:
>> Curious…do you deal with bi-directional text ever (i.e. both
>> L-to-R and R-to-L variants in the same text block)? Just looking
>> over at the java.awt.font.* package and how it manages this.
>> Granted, it doesn’t do so using Locales (which generally make more
>> sense to me).
> 
> No current plans to handle bidi text, but I'd rather not rule it out
> if I can avoid doing so.
> 
>> I’m curious how cleanly Locale.script maps onto fonts over there
>> because they seem to be using fonts to sort out whether something
>> is L-to-R or R-to-L.
> 
> Hmm. I won't be able to predict the font being used by the browser, of
> course.
> 
> I have discovered, unfortunately, that Java's "script" field is often
> empty even for languages known to be right-to-left, such as "ar"
> (Arabic). So I may have to implement my own list of scripts for a
> particular language, or at least maybe a default-script given a
> particular language. Or simply stick with the naive implementation I
> already have where a single language implies ltr/rtl which isn't
> perfect but ... perfect may be difficult to achieve.

Interesting, I’ll keep reading around.

Cheers,
-Rob



Re: [OT?] Managing text-direction in Java

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Rob,

On 4/10/20 13:26, Rob Tompkins wrote:
> Curious…do you deal with bi-directional text ever (i.e. both
> L-to-R and R-to-L variants in the same text block)? Just looking
> over at the java.awt.font.* package and how it manages this.
> Granted, it doesn’t do so using Locales (which generally make more
> sense to me).

No current plans to handle bidi text, but I'd rather not rule it out
if I can avoid doing so.
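
For reference, if bidi ever does come up, the JDK can classify a piece
of text by its content rather than by its Locale. The following is
only a sketch using java.text.Bidi; the class and method names are
hypothetical, and whether this fits the database content described in
this thread is an open question.

```java
import java.text.Bidi;

final class BidiProbe {
    // Classifies text as "ltr", "rtl", or "mixed" from its characters.
    // The base direction is taken from the first strongly-directional
    // character, defaulting to left-to-right if there is none.
    public static String describe(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isMixed()) {
            return "mixed";
        }
        return bidi.isRightToLeft() ? "rtl" : "ltr";
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // a single Hebrew word
        System.out.println(describe("hello"));         // ltr
        System.out.println(describe(hebrew));          // rtl
        System.out.println(describe("abc " + hebrew)); // mixed
    }
}
```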

> I’m curious how cleanly Locale.script maps onto fonts over there
> because they seem to be using fonts to sort out whether something
> is L-to-R or R-to-L.

Hmm. I won't be able to predict the font being used by the browser, of
course.

I have discovered, unfortunately, that Java's "script" field is often
empty even for languages known to be right-to-left, such as "ar"
(Arabic). So I may have to implement my own list of scripts for a
particular language, or at least maybe a default-script given a
particular language. Or simply stick with the naive implementation I
already have where a single language implies ltr/rtl which isn't
perfect but ... perfect may be difficult to achieve.
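
A sketch of the "default script given a particular language" fallback
described above. An explicit script on the Locale wins; otherwise a
small hand-maintained table supplies one. The table contents and all
names here are assumptions for illustration, not a complete or
authoritative mapping.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class DefaultScripts {
    // Hypothetical defaults, used only when the Locale has no script.
    private static final Map<String, String> DEFAULT_SCRIPT = Map.of(
        "ar", "Arab",
        "fa", "Arab",
        "ur", "Arab",
        "he", "Hebr",
        "iw", "Hebr"); // legacy code some JDKs return for Hebrew

    private static final Set<String> RTL_SCRIPTS = Set.of("Arab", "Hebr");

    // Explicit script if present, else the language's default, else "".
    public static String effectiveScript(Locale locale) {
        String script = locale.getScript();
        if (!script.isEmpty()) {
            return script;
        }
        return DEFAULT_SCRIPT.getOrDefault(locale.getLanguage(), "");
    }

    public static boolean isRightToLeft(Locale locale) {
        return RTL_SCRIPTS.contains(effectiveScript(locale));
    }

    public static void main(String[] args) {
        System.out.println(isRightToLeft(Locale.forLanguageTag("ar")));      // true
        System.out.println(isRightToLeft(Locale.forLanguageTag("ar-Latn"))); // false
        System.out.println(isRightToLeft(Locale.forLanguageTag("en")));      // false
    }
}
```

This keeps the naive language list as the fallback while still
letting an explicit script override it for the records that have one.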

Thanks,
- -chris

-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl6d0IoACgkQHPApP6U8
pFjGIQ/+NvLf76QG8QRtwUuckTgkGr+9jTXJSTBXzRmiFWxUUn5jafVUD5EE7ZoB
CkvgSsVUZOB/ByRtBB13ftJqpv05Auy4jkE3EZ5M2E1bfnnFdY8A9DdwKRkiPl5Y
0b1cQBNeQHABcjK0kuMVfCIf7NAicVDJOkBwelNqVW90PcWVhxy1q2fXrtqcWfkF
jTar3mzEXrSpxGU5i9aBXyG8eA7vDuOUUqQzyvERv95ZcC56skbvubNI5GtbvlrI
wH/OsYzF/ZA2a/Vfmr15PuS6PxEIeXx2ElCzMjZjB+TgYwa5DfMbXf+MtS3pz8QW
4Vqey5PedooHiLvMvmXK4NyCmfkYKa2kWvp3AgRUGM73tuqvOgf2CU9hovk2MnHh
RM2LBWDap69ZBhfZvor3igQ9EulRkWrPfkNW2sMOdcdVSTrwgDXhPuPyRND4nMiE
vdTJtG7ZAHEdTNgUgFYcadNMPgoybKSoZg5Od1YLPqE90QzBzdXLjtn+rjOtyh5J
pSlHYg2uqiN07z2n3kgKr9TrGbaOwOPQs4v+gonveSagtzUL0xbU5Qu+/jvZn8WZ
PADL9XrebYiDkc0jkTqu2HPcOBXfcKeoW8+3DiO8TA46VpLA4KDXNYMwn/lqHh3u
gmIQZbewAnbyNIsHa/CXF2Zk3DTvRdxdIXZGDBca7Yjpp8KzZww=
=xJRI
-----END PGP SIGNATURE-----



Re: [OT?] Managing text-direction in Java

Posted by Rob Tompkins <ch...@gmail.com>.
Curious…do you deal with bi-directional text ever (i.e. both L-to-R and R-to-L variants in the same text block)? Just looking over at the java.awt.font.* package and how it manages this. Granted, it doesn’t do so using Locales (which generally make more sense to me).

I’m curious how cleanly Locale.script maps onto fonts over there because they seem to be using fonts to sort out whether something is L-to-R or R-to-L.

-Rob


