Posted to solr-user@lucene.apache.org by Ankita Patil <an...@germinait.com> on 2012/01/24 11:36:31 UTC

solr stopwords issue - documents are not matching

Hi,

I am using Solr 3.4. The relevant part of my schema looks like this:

<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">

    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_en.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>

    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_en.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
</fieldType>

stopwords_en.txt contains:
a
an
and
are
as

etc.

Now when I search for "buy house", Solr does not return the documents
containing the text "buy a house".
Likewise, when I search for "buy a house", Solr does not return the
documents containing the text "buy house".

Part of the debugQuery output is:
<str name="rawquerystring">cContent:"buy a house"</str>
<str name="querystring">cContent:"buy a house"</str>
<str name="parsedquery">PhraseQuery(cContent:"bui ? hous")</str>
<str name="parsedquery_toString">cContent:"bui ? hous"</str>
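
If I read this correctly, the "?" in the parsed query is the position left behind by the removed stopword "a", so the phrase query expects exactly one empty position between "bui" and "hous". The Python sketch below is only my attempt to illustrate that idea; it is not Solr code. The helpers `analyze` and `phrase_matches` are hypothetical names of my own, the `keep_gaps` flag is my assumption about what enablePositionIncrements controls, and stemming, synonyms and the word delimiter filter are left out.

```python
# Stopwords taken from the part of stopwords_en.txt shown above.
STOPWORDS = {"a", "an", "and", "are", "as"}


def analyze(text, keep_gaps=True):
    """Whitespace-tokenise, lowercase and drop stopwords.

    Returns a list of (token, position) pairs. With keep_gaps=True a removed
    stopword still consumes a position (assumed to mimic
    enablePositionIncrements="true"); with keep_gaps=False it does not.
    """
    tokens = []
    position = -1
    for word in text.lower().split():
        if word in STOPWORDS:
            if keep_gaps:
                position += 1  # leave a hole where the stopword was
            continue
        position += 1
        tokens.append((word, position))
    return tokens


def phrase_matches(query_tokens, doc_tokens):
    """Exact phrase match (no slop): relative positions must be identical."""
    if not query_tokens:
        return False
    doc_positions = {}
    for term, pos in doc_tokens:
        doc_positions.setdefault(term, set()).add(pos)
    first_term, first_pos = query_tokens[0]
    for start in doc_positions.get(first_term, ()):
        offset = start - first_pos
        if all(pos + offset in doc_positions.get(term, ())
               for term, pos in query_tokens):
            return True
    return False


if __name__ == "__main__":
    for gaps in (True, False):
        query = analyze("buy a house", keep_gaps=gaps)
        doc = analyze("buy house", keep_gaps=gaps)
        print(gaps, query, doc, phrase_matches(query, doc))
```

In this simulation the two texts only match each other when the gaps are not kept, which is the same behaviour I am seeing in both directions with my current settings.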

Any idea how I can solve this problem, or what is wrong?

Thanks
Ankita