Posted to solr-user@lucene.apache.org by alexander sulz <a....@digiconcept.net> on 2010/10/05 21:25:20 UTC

Umlaut in facet name attribute

  Good Evening and Morning.

I noticed that if I do a facet search on a field whose value contains
umlauts (ö, ä, ü), the returned facet list shows the value with the
umlauts converted to plain characters (o, a, u).

How do I prevent this from happening?

I can't seem to find the configuration for faceting in the schema or
config XML files.

thx
  alex

Re: Re: Umlaut in facet name attribute

Posted by Lance Norskog <go...@gmail.com>.
Faceting on analyzed text can eat a lot of RAM. This strategy might not scale.

On Tue, Oct 5, 2010 at 4:00 PM, Savvas-Andreas Moysidis
<sa...@googlemail.com> wrote:
> Good point,
>
> so you could have an unanalyzed counterpart field populated with a
> <copyField/> and facet on that.



-- 
Lance Norskog
goksron@gmail.com

Re: Re: Umlaut in facet name attribute

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Good point,

so you could have an unanalyzed counterpart field populated with a
<copyField/> and facet on that.
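
As a rough sketch of that setup in schema.xml: the field names `category`
and `category_facet` and the `text` type are hypothetical, chosen only for
illustration, and `string` is assumed to be an unanalyzed type as in the
example schema shipped with Solr.

```xml
<!-- Hypothetical field names; adjust to your own schema. -->
<!-- Analyzed field: used for searching and filtering. -->
<field name="category" type="text" indexed="true" stored="true"/>

<!-- Unanalyzed counterpart: values are indexed verbatim, so "Möbel" stays "Möbel". -->
<field name="category_facet" type="string" indexed="true" stored="false"/>

<!-- Copy the raw input of the analyzed field into the facet field at index time. -->
<copyField source="category" dest="category_facet"/>
```

A request would then facet on the copy, for example with
facet=true&facet.field=category_facet. The documents need to be reindexed
after the schema change before the new field has any values.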

On 5 October 2010 23:49, Markus Jelsma <ma...@buyways.nl> wrote:

> It is good practice (in many cases, as seen on this list) to search
> (usually with fq) on analyzed fields but build the facet list from their
> unanalyzed counterparts.

RE: Re: Umlaut in facet name attribute

Posted by Markus Jelsma <ma...@buyways.nl>.
It is good practice (in many cases, as seen on this list) to search (usually with fq) on analyzed fields but build the facet list from their unanalyzed counterparts.
 
-----Original message-----
From: Savvas-Andreas Moysidis <sa...@googlemail.com>
Sent: Wed 06-10-2010 00:46
To: solr-user@lucene.apache.org; 
Subject: Re: Umlaut in facet name attribute

Hello,

It seems that your analysis chain strips diacritics such as umlauts and
therefore indexes the terms without them. What you see in the facet results
is the text as it was indexed.

If you select a Tokenizer/Token Filter combination that preserves
diacritics, you should see the values you expect.

Cheers,
-- Savvas


Re: Umlaut in facet name attribute

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Hello,

It seems that your analysis chain strips diacritics such as umlauts and
therefore indexes the terms without them. What you see in the facet results
is the text as it was indexed.

If you select a Tokenizer/Token Filter combination that preserves
diacritics, you should see the values you expect.
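
For illustration, an analysis chain that behaves this way might look like
the sketch below. The actual schema was not posted, so the type name
`text_folded` and the exact filter list are assumptions;
`solr.ASCIIFoldingFilterFactory` (or the older
`solr.ISOLatin1AccentFilterFactory`) is a common filter that maps ö, ä, ü
to o, a, u.

```xml
<!-- Assumed example, not the poster's actual schema. -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- This filter folds accented characters to ASCII: "möbel" is indexed as "mobel". -->
    <!-- Facet values come from indexed terms, so removing it keeps the umlauts. -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that even without the folding filter, faceting on a tokenized field
returns individual lowercased tokens rather than the original field values.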

Cheers,
-- Savvas

On 5 October 2010 20:25, alexander sulz <a....@digiconcept.net> wrote:

>  Good Evening and Morning.
>
> I noticed that if I do a facet search on a field whose value contains
> umlauts (ö, ä, ü), the returned facet list shows the value with the
> umlauts converted to plain characters (o, a, u).
>
> How do I prevent this from happening?
>
> I can't seem to find the configuration for faceting in the schema or
> config XML files.
>
> thx
>  alex
>