
Posted to solr-user@lucene.apache.org by Chantal Ackermann <ch...@btelligent.de> on 2009/10/01 11:51:21 UTC

Where to place ReversedWildcardFilterFactory in Chain

Hi all,

I have two questions about the ReversedWildcardFilterFactory:
a) Should it go into both chains, index and query, or into the index
chain only?
b) Where exactly in the/each chain do I have to put it? (Do I have to
respect a certain order, as I have WordDelimiter and LowerCase filters
in there as well?)

More Details:

I understand it is used to allow queries like "*sport".

My current configuration for the field I want to use it for contains 
this setup:

<fieldType name="text_cn" class="solr.TextField">
   <analyzer>
     <filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="1" splitOnNumerics="1"
        stemEnglishPossessive="1" generateWordParts="1"
        generateNumberParts="1" catenateAll="1"
        preserveOriginal="1" />
     <filter class="solr.LowerCaseFilterFactory" />
   </analyzer>
</fieldType>

The wiki page 
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters states for 
the ReversedWildcardFF:
"Add this filter to the index analyzer, but not the query analyzer."

However, the API documentation for it says it provides functionality at
both index and query time (as I understand it):
"When this factory is added to an analysis chain, it will be used both 
for filtering the tokens during indexing, and to determine the query 
processing of this field during search."

Any help is greatly appreciated.
Thanks!
Chantal



-- 
Chantal Ackermann

Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Chantal Ackermann <ch...@btelligent.de>.

Sorry! I didn't replace the war file correctly. It was still the one 
from the start of August.




Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Chantal Ackermann <ch...@btelligent.de>.

Hi Mark,

the README.txt in the main directory contains:
$Id: CHANGES.txt 817424 2009-09-21 21:53:41Z yonik $

I've downloaded the package as an artifact from the Hudson server.

Chantal


Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Mark Miller <ma...@gmail.com>.

It was added to trunk on the 11th and shouldn't require a patch. Are you
sure that nightly was actually built after then?

solr.ReversedWildcardFilterFactory should work fine.



-- 
- Mark

http://www.lucidimagination.com

Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Chantal Ackermann <ch...@btelligent.de>.

Hi Andrzej,

thanks! Unfortunately, I get a ClassNotFoundException for the 
solr.ReversedWildcardFilterFactory with my nightly build from the 22nd of 
September. I've found the corresponding JIRA issue, but it's not obvious 
from the wiki whether this requires a patch. I'll have a closer look at 
the JIRA issue in any case.

Cheers,
Chantal



Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Andrzej Bialecki <ab...@getopt.org>.

Chantal Ackermann wrote:
> Thanks, Mark!
> But I suppose it does matter where in the index chain it goes? I would 
> guess it is applied to the tokens, so I suppose I should put it at the 
> very end - after WordDelimiter and Lowercase have been applied.
> 
> 
> Is that correct?
> 
>   <analyzer type="index">
>     <filter class="solr.WordDelimiterFilterFactory"
>        splitOnCaseChange="1" splitOnNumerics="1"
>        stemEnglishPossessive="1" generateWordParts="1"
>        generateNumberParts="1" catenateAll="1"
>        preserveOriginal="1" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.ReversedWildcardFilterFactory" />
>   </analyzer>

Yes. Care should be taken that the query analyzer chain produces the 
same forward tokens, because the code in QueryParser that optionally 
reverses tokens acts on tokens that it receives _after_ all other query 
analyzers have run on the query.
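
To illustrate the point about matching forward tokens, here is a minimal
sketch of the text_cn field type with separate index and query analyzers.
The WhitespaceTokenizerFactory is an assumption (the configuration in this
thread shows no tokenizer), so substitute whatever tokenizer the field
really uses. The filters and their attributes are copied from the
configuration posted in this thread.

```xml
<fieldType name="text_cn" class="solr.TextField">
  <!-- Index chain: forward filters first, reversal as the last step -->
  <analyzer type="index">
    <!-- Assumption: the tokenizer is not shown in the original configuration -->
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
       splitOnCaseChange="1" splitOnNumerics="1"
       stemEnglishPossessive="1" generateWordParts="1"
       generateNumberParts="1" catenateAll="1"
       preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ReversedWildcardFilterFactory" />
  </analyzer>
  <!-- Query chain: same tokenizer and forward filters, no reversal filter -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
       splitOnCaseChange="1" splitOnNumerics="1"
       stemEnglishPossessive="1" generateWordParts="1"
       generateNumberParts="1" catenateAll="1"
       preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

The intent is that both chains yield the same forward tokens, so that the
reversal is the only difference between them.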


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Chantal Ackermann <ch...@btelligent.de>.

Thanks, Mark!
But I suppose it does matter where in the index chain it goes? I would 
guess it is applied to the tokens, so I suppose I should put it at the 
very end - after WordDelimiter and Lowercase have been applied.


Is that correct?

  <analyzer type="index">
    <filter class="solr.WordDelimiterFilterFactory"
       splitOnCaseChange="1" splitOnNumerics="1"
       stemEnglishPossessive="1" generateWordParts="1"
       generateNumberParts="1" catenateAll="1"
       preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ReversedWildcardFilterFactory" />
  </analyzer>


Cheers,
Chantal


Re: Where to place ReversedWildcardFilterFactory in Chain

Posted by Mark Miller <ma...@gmail.com>.

You just put it in the index chain, not the query chain. The
SolrQueryParser will consult it when building a wildcard search - don't
put it in the query chain. I know, appears like a bit of magic. That
Andrzej is a wizard though, so it makes sense ;)
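
For reference, a sketch of the index-side filter with its tuning
attributes spelled out. The attribute names are the ones documented for
ReversedWildcardFilterFactory. The values shown are illustrative
assumptions rather than recommendations, so check them against the
documentation of the build in use.

```xml
<!-- Goes into the index analyzer only; values are illustrative assumptions -->
<filter class="solr.ReversedWildcardFilterFactory"
        withOriginal="true"
        maxPosAsterisk="2"
        maxPosQuestion="1"
        minTrailing="2"
        maxFractionAsterisk="0" />
```

As far as I understand it, withOriginal="true" keeps the forward token in
the index next to the reversed one, and the remaining attributes steer
which wildcard patterns the query parser chooses to reverse.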

-- 
- Mark

http://www.lucidimagination.com