Posted to solr-user@lucene.apache.org by Sandro Zbinden <zb...@imagic.ch> on 2013/09/09 13:16:03 UTC

Facet Sort with non ASCII Characters

Dear Solr users,

Is there a plan to add support for alphabetical facet sorting with non-ASCII characters?

Best regards Sandro


________________________________
Sandro Zbinden
Software Engineer




Re: Facet Sort with non ASCII Characters

Posted by Sandro Zbinden <zb...@imagic.ch>.
Hey Yonik

I installed the latest Solr (Solr 4.4) and started the Jetty instance configured in the example directory.

I added three titles to the core collection1: a, ä and b.

curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '[{"id" : "1", "title" : "a"},{"id" : "2", "title" : "ä"},{"id" : "3", "title" : "b"}]'

Now I want to sort these three titles with the following query: 

http://localhost:8983/solr/collection1/select?q=*:*&facet=true&facet.sort=index&facet.field=title&rows=0

I expect:

<lst name="title">
<int name="a">1</int>
<int name="ä">1</int>
<int name="b">1</int>
</lst>

But I receive 

<lst name="title">
<int name="a">1</int>
<int name="b">1</int>
<int name="ä">1</int>
</lst>

PS: In Java I would sort these values with a Comparator that uses Collator.getInstance().compare(value1, value2);
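
A minimal sketch of the comparison meant in the PS, using only the JDK. The class and method names are made up for illustration, and Locale.GERMAN is an assumption (the PS uses the default locale). Plain String ordering reproduces the order received from Solr above, while a Collator gives the expected order. This shows the Java side only and is not Solr code.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of Solr.
public class FacetSortExample {

    // Natural String ordering compares UTF-16 code units, so the character
    // U+00E4 (a with diaeresis) sorts after every ASCII letter.
    public static List<String> sortByCodeUnit(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(null); // null selects the natural ordering
        return sorted;
    }

    // Collator implements Comparator<Object> and applies the locale's
    // alphabetical rules, which place U+00E4 between "a" and "b" in German.
    public static List<String> sortByCollator(List<String> values, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00e4" is used instead of a literal so the source file encoding does not matter.
        List<String> titles = Arrays.asList("a", "\u00e4", "b");
        System.out.println("code unit order: " + sortByCodeUnit(titles));
        System.out.println("collator order:  " + sortByCollator(titles, Locale.GERMAN));
    }
}
```

The first call returns the titles in the order a, b, ä and the second in the order a, ä, b.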

Best regards 

Sandro

-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: Monday, 9 September 2013 21:26
To: solr-user@lucene.apache.org
Subject: Re: Facet Sort with non ASCII Characters

On Mon, Sep 9, 2013 at 7:16 AM, Sandro Zbinden <zb...@imagic.ch> wrote:
> Is there a plan to add support for alphabetical facet sorting with non ASCII Characters ?

The entire unicode range should already work.  Can you give an example of what you would like to see?

-Yonik
http://lucidworks.com

Re: Facet Sort with non ASCII Characters

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Mon, Sep 9, 2013 at 7:16 AM, Sandro Zbinden <zb...@imagic.ch> wrote:
> Is there a plan to add support for alphabetical facet sorting with non ASCII Characters ?

The entire unicode range should already work.  Can you give an example
of what you would like to see?

-Yonik
http://lucidworks.com

Re: Facet Sort with non ASCII Characters

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Mon, 2013-09-09 at 13:16 +0200, Sandro Zbinden wrote:
> Is there a plan to add support for alphabetical facet sorting with non
> ASCII Characters ?

Not to my knowledge. I discussed an idea a year ago about handling it
with modified ICUCollatorKeys, but that solution does not work well with
the way Solr's current analysis-chain for field content works.

See the thread at
http://lucene.472066.n3.nabble.com/Collator-based-facet-sorting-in-Solr-td4006934.html


We do Collator-based sorting of facet values locally with our custom
facet implementation, but it does the sorting after index-open instead
of using CollatorKeys and thus has quite a startup-time penalty.
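
To illustrate the collation-key idea in general terms, here is a rough sketch that uses the JDK's java.text.CollationKey rather than ICU. It is not the custom facet implementation described above and not Solr code; the class and method names are hypothetical. Each term is turned into a key once, and the keys are then sorted by comparing their precomputed bytes instead of re-running the collation rules for every comparison.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class: sorts facet terms through precomputed collation keys.
public class CollationKeySortExample {

    public static List<String> sortWithPrecomputedKeys(List<String> terms, Locale locale) {
        Collator collator = Collator.getInstance(locale);

        // One key per term. This is the expensive step; a real system could
        // do it once and keep the result of toByteArray() alongside the term.
        List<CollationKey> keys = new ArrayList<>(terms.size());
        for (String term : terms) {
            keys.add(collator.getCollationKey(term));
        }

        // CollationKey implements Comparable<CollationKey>, so the natural
        // ordering compares the precomputed keys.
        keys.sort(null);

        // Map the sorted keys back to the original strings.
        List<String> sorted = new ArrayList<>(keys.size());
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString());
        }
        return sorted;
    }
}
```

Building the keys when the index is opened, as in this sketch, pays the cost at startup for every term. Keys computed ahead of time would avoid that cost, which is the trade-off mentioned above.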

- Toke Eskildsen, State and University Library, Denmark