You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by tom_ <to...@web.de> on 2008/11/29 16:37:59 UTC

ORDER BY and greek characters

I'm using ORDER BY and a database with collation territory based de_DE.
Strings starting with greek characters are at the end of the resultset.
Users would expect them near the respective latin character, e.g. greek
"alpha" near "a". Is there a possiblity to sort in this way?
  
-- 
View this message in context: http://www.nabble.com/ORDER-BY-and-greek-characters-tp20748193p20748193.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: ORDER BY and greek characters

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
Rick Hillegas <Ri...@Sun.COM> writes:

> tom_ wrote:
>> I'm using ORDER BY and a database with collation territory based de_DE.
>> Strings starting with greek characters are at the end of the resultset.
>> Users would expect them near the respective latin character, e.g. greek
>> "alpha" near "a". Is there a possiblity to sort in this way?
>>     
> Hi Tom,
>
> Here a couple suggestions:
[...]
> 2) You could change Derby so that it supports user-defined Locales
> with custom collations. I haven't looked into the details here, but
> this seems like a smallish task--gated by a community discussion about
> how to declare these user-created Locales.
>
> + This would be generally useful.
> + Indexes would work for you and give your queries better performance.
> - This solution wouldn't hit production until Derby 10.5 at the earliest.

This can in fact be done with no changes to the Derby code, since Java
SE 6 has an API for defining custom locales and collation rules. I have
given an example of how you can use this to achieve what you want here:
http://blogs.sun.com/kah/entry/user_defined_collation_in_apache

Hope this helps,

-- 
Knut Anders

Re: ORDER BY and greek characters

Posted by tom_ <to...@web.de>.
Hi Rick,

thank you for the suggestions, I have to examine them and will tell you if
there are further problems.

Tom


Rick Hillegas-2 wrote:
> 
> tom_ wrote:
>> I'm using ORDER BY and a database with collation territory based de_DE.
>> Strings starting with greek characters are at the end of the resultset.
>> Users would expect them near the respective latin character, e.g. greek
>> "alpha" near "a". Is there a possiblity to sort in this way?
>>   
>>   
> Hi Tom,
> 
> Here a couple suggestions:
> 
> 1) You can write your own public static method which takes a String 
> argument and returns a byte array, where the byte array sorts in the 
> order that you want--note that Derby sorts byte arrays 
> lexicographically, just like Strings. The existing support for 
> CollationKey should help you write this method. Your method's signature 
> would look like this:
> 
>   public static byte[] myOrder( String raw ) throws SQLException { ... }
> 
> Then you would register your method as a Derby function:
> 
> create function myOrder
> (
>     raw varchar( 100 )
> )
> returns varchar( 1000 ) for bit data
> language java
> parameter style java
> no sql
> external name 'MyClass.myOrder'
> 
> Finally, you would use this function in queries like this:
> 
> select * from t order by myOrder( a )
> 
> + The good thing about this solution is that it should work today on 10.4.
> - The downside is that you don't have indexing support for these queries.
> 
> 
> 2) You could change Derby so that it supports user-defined Locales with 
> custom collations. I haven't looked into the details here, but this 
> seems like a smallish task--gated by a community discussion about how to 
> declare these user-created Locales.
> 
> + This would be generally useful.
> + Indexes would work for you and give your queries better performance.
> - This solution wouldn't hit production until Derby 10.5 at the earliest.
> 
> 
> 3) You could build on solution (1) and use generated columns. As in (1), 
> you would write the myOrder function. Your would register this method as 
> a deterministic function:
> 
> create function myOrder
> (
>     raw varchar( 100 )
> )
> returns varchar( 1000 ) for bit data
> language java
> deterministic
> parameter style java
> no sql
> external name 'z.myOrder'
> 
> Then you would declare and index your table like this:
> 
> create table t( a varchar( 100 ), b generated always as ( myOrder( a ) ) )
> create index t_idx on t( b )
> 
> Finally, you would query your table like so:
> 
> select a from t order by b
> 
> + This would give you indexing support as in (2).
> - Generated columns won't be available until the next feature release,
> 10.5
> 
> 
> Let me know if you have more questions about these solutions.
> 
> 
> Hope this helps,
> -Rick
> 
> 
> 
> 
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/ORDER-BY-and-greek-characters-tp20748193p20774503.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: ORDER BY and greek characters

Posted by Rick Hillegas <Ri...@Sun.COM>.
tom_ wrote:
> I'm using ORDER BY and a database with collation territory based de_DE.
> Strings starting with greek characters are at the end of the resultset.
> Users would expect them near the respective latin character, e.g. greek
> "alpha" near "a". Is there a possiblity to sort in this way?
>   
>   
Hi Tom,

Here a couple suggestions:

1) You can write your own public static method which takes a String 
argument and returns a byte array, where the byte array sorts in the 
order that you want--note that Derby sorts byte arrays 
lexicographically, just like Strings. The existing support for 
CollationKey should help you write this method. Your method's signature 
would look like this:

  public static byte[] myOrder( String raw ) throws SQLException { ... }

Then you would register your method as a Derby function:

create function myOrder
(
    raw varchar( 100 )
)
returns varchar( 1000 ) for bit data
language java
parameter style java
no sql
external name 'MyClass.myOrder'

Finally, you would use this function in queries like this:

select * from t order by myOrder( a )

+ The good thing about this solution is that it should work today on 10.4.
- The downside is that you don't have indexing support for these queries.


2) You could change Derby so that it supports user-defined Locales with 
custom collations. I haven't looked into the details here, but this 
seems like a smallish task--gated by a community discussion about how to 
declare these user-created Locales.

+ This would be generally useful.
+ Indexes would work for you and give your queries better performance.
- This solution wouldn't hit production until Derby 10.5 at the earliest.


3) You could build on solution (1) and use generated columns. As in (1), 
you would write the myOrder function. Your would register this method as 
a deterministic function:

create function myOrder
(
    raw varchar( 100 )
)
returns varchar( 1000 ) for bit data
language java
deterministic
parameter style java
no sql
external name 'z.myOrder'

Then you would declare and index your table like this:

create table t( a varchar( 100 ), b generated always as ( myOrder( a ) ) )
create index t_idx on t( b )

Finally, you would query your table like so:

select a from t order by b

+ This would give you indexing support as in (2).
- Generated columns won't be available until the next feature release, 10.5


Let me know if you have more questions about these solutions.


Hope this helps,
-Rick