Posted to solr-user@lucene.apache.org by "damian.pawski" <dp...@gmail.com> on 2018/05/31 15:04:24 UTC
Solr 7, exact phrase search, empty results for some records
Hi,
I have updated Solr from 5.4.1 to 7.2.1.
I have updated the settings accordingly, but in some cases when I am
searching for an exact phrase surrounded by quotes I am getting 0 results.
In 5.4.1 I have
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
In 7.2.1 I have
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
I couldn't find any pattern that explains why searches with quotes work
fine for some records but return 0 results for others. (I have checked
that the missing records were imported, as I can find them by Id.)
Could you point me in the right direction on how to investigate this?
I have checked the results of the "..analysis..." pages on both instances of
Solr for the problematic records and in both cases I am getting the same
outcome.
Thank you
Damian
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr 7, exact phrase search, empty results for some records
Posted by "damian.pawski" <dp...@gmail.com>.
Thank you for the quick response.
I have moved the
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
from the <analyzer type="index"> section to the <analyzer type="query">
section, and it is working fine.
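For anyone following along, the 7.2.1 field type after that move would look roughly like the sketch below. The exact position of the synonym filter in the query chain is not stated above; placing it directly after the tokenizer, mirroring where it sat in the index chain, is an assumption.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- SynonymGraphFilterFactory removed from the index chain -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <!-- still needed: WordDelimiterGraphFilterFactory emits a graph at index time -->
    <filter class="solr.FlattenGraphFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- moved here from the index chain; position is an assumption -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the index-time chain changes, documents indexed with the old chain need to be reindexed before the new analysis applies to them.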
Once again, thank you.
Damian
Re: Solr 7, exact phrase search, empty results for some records
Posted by Erick Erickson <er...@gmail.com>.
The analysis page has one major thing to be aware of: it shows how
text is analyzed _after_ query parsing, so it won't reveal anything the
query parser itself does differently. I applaud your use of it, though;
it's where lots of problems are found ;).
Try adding &debug=query to the request in both cases. In particular,
look at parsedquery_toString in the response and compare the two.
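If appending the parameter by hand on both instances gets tedious, one option is to set it as a default on a separate request handler in solrconfig.xml. This is only a minimal sketch, assuming the stock solr.SearchHandler; the handler name /select-debug is made up for illustration.

```xml
<!-- Hypothetical handler name; add the same block to both Solr instances -->
<requestHandler name="/select-debug" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- equivalent to appending &debug=query to each request -->
    <str name="debug">query</str>
  </lst>
</requestHandler>
```

Sending the same quoted phrase to this handler on the 5.4.1 and 7.2.1 instances should then return a debug section in each response that can be compared side by side.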
And I don't _think_ this is the issue since you're specifying phrases,
but the split-on-whitespace (sow) default has changed; see:
https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/
Good luck,
Erick