Posted to solr-user@lucene.apache.org by Bernd Fehling <be...@uni-bielefeld.de> on 2014/10/23 16:31:22 UTC

QueryAutoStopWordAnalyzer

I just came across the QueryAutoStopWordAnalyzer in Lucene.
Has anyone managed to use it with Solr?

I could imagine using it as a language-independent search "clean up"
for the text_all field.

Can it be used with Solr right out of the box, or do I have to
write a wrapper or factory?

Regards
Bernd

Re: QueryAutoStopWordAnalyzer

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
How is this different from using StopFilterFactory in Solr:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/core/StopFilterFactory.html
?

Lucene "wraps" analyzers, whereas Solr uses a chain instead (though
analyzers are supported as well).

You just configure the chain. Writing a factory for the case where one
analyzer wraps another would just duplicate the chain code.

What am I missing?

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


Re: QueryAutoStopWordAnalyzer

Posted by Bernd Fehling <be...@uni-bielefeld.de>.

On 23.10.2014 at 18:03, Alexandre Rafalovitch wrote:
> How is this different from using StopFilterFactory in Solr:
> http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/core/StopFilterFactory.html
> ?

With StopFilterFactory you have to set up a file of stopwords
and maintain that file.

For QueryAutoStopWordAnalyzer the docs say: "An Analyzer used primarily
at query time to wrap another analyzer and provide a layer of protection
which prevents very common words from being passed into queries."
And this is what I'm looking for: "... QueryAutoStopWordAnalyzer with stopwords
calculated for the given selection of fields from terms with a document
frequency greater than the given maxDocFreq", which is set to 0.400... by default
but can probably be adjusted to a "value of personal taste".
So the stopwords are derived from the index itself, and you don't have to
set up and maintain a stopword.txt file.
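
To make the idea concrete, here is a rough sketch of what "stopwords
calculated from document frequency" means. It is plain Java and does not
use Lucene at all; the class AutoStopWords and the methods
computeStopWords and filterQuery are hypothetical names made up for this
illustration, and the tiny in-memory "index" is an assumption. It is not
how Lucene implements it, only the principle as I understand it from the
docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not Lucene's QueryAutoStopWordAnalyzer.
class AutoStopWords {

    // Returns every term that occurs in more than maxDocFreq (a fraction
    // between 0 and 1) of the given documents.
    static Set<String> computeStopWords(List<List<String>> docs, double maxDocFreq) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            // Count each term at most once per document.
            for (String term : new HashSet<>(doc)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        Set<String> stopWords = new TreeSet<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            double fraction = (double) e.getValue() / docs.size();
            if (fraction > maxDocFreq) {
                stopWords.add(e.getKey());
            }
        }
        return stopWords;
    }

    // Drops the computed stopwords from an already tokenized query.
    static List<String> filterQuery(List<String> queryTerms, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String term : queryTerms) {
            if (!stopWords.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample data: five tiny, already tokenized documents.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("the", "quick", "fox"),
                Arrays.asList("the", "lazy", "dog"),
                Arrays.asList("the", "old", "house"),
                Arrays.asList("a", "red", "car"),
                Arrays.asList("the", "end"));
        Set<String> stopWords = computeStopWords(docs, 0.4);
        System.out.println("stopwords: " + stopWords);
        System.out.println("query: " + filterQuery(Arrays.asList("the", "fox"), stopWords));
    }
}
```

No word list appears anywhere in that sketch; the only thing to tune is
the threshold, and the same code works for any language the tokenizer
can handle.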

I might be wrong, but doesn't this sound different from StopFilterFactory?

Regards
Bernd
