Posted to solr-user@lucene.apache.org by Rick Leir <rl...@leirtech.com> on 2018/01/25 22:13:48 UTC

pf2


Hi all
My pf2 keywords^11.0 works for English but not for French. Here are the fieldtypes, taken from two schema.xml files in separate cores. Solr 5.2.2, edismax, q.op=AND.
I suspect there are several problems with the French schema. Maybe I only needed to show the query analyzer, not the index analyzer?

For French, the pf2 does not show a match in the debugQuery=true output. However, qf keywords^10.0 does show a match. The keywords field is copied (copyField) into text, which is the df. Is there any other field I should be showing?
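
For anyone trying to reproduce this, here is a minimal sketch of the request described above, in Python using only the standard library. The host, port, core name (`french_core`) and the sample query text are assumptions for illustration. The parameter values (edismax, q.op, qf, pf2, df, debugQuery) are the ones mentioned in this message.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/french_core/select"


def build_debug_url(query_text):
    """Build a select URL carrying the edismax settings described in the post."""
    params = {
        "q": query_text,
        "defType": "edismax",
        "q.op": "AND",
        "qf": "keywords^10.0",
        "pf2": "keywords^11.0",
        "df": "text",
        "debugQuery": "true",
    }
    # urlencode percent-encodes the boost caret and the spaces in the query.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Sample query text is made up; substitute a phrase that fails for you.
    print(build_debug_url("avion de papier"))
```

Comparing the parsed query section of the debug output for the English and French cores should show whether a pf2 phrase clause is generated at all, and which terms it contains.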
Thanks
Rick

<fieldType class="solr.TextField" name="text_en" positionIncrementGap="100">
<analyzer type="index">
   <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
   <tokenizer class="solr.ClassicTokenizerFactory"/>
   <filter class="solr.SynonymFilterFactory" expand="false" ignoreCase="true" synonyms="synonyms.txt" tokenizerFactory="solr.StandardTokenizerFactory"/>
   <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.EnglishPossessiveFilterFactory"/>
   <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
   <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_en.txt" ignoreCase="true"/>
   <filter class="solr.EnglishMinimalStemFilterFactory"/>
   <filter class="solr.SnowballPorterFilterFactory" language="English" />
   <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
<analyzer type="query">
   <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
   <tokenizer class="solr.ClassicTokenizerFactory"/>
   <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.EnglishPossessiveFilterFactory"/>
   <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
   <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_en.txt" ignoreCase="true"/>
   <filter class="solr.EnglishMinimalStemFilterFactory"/>
   <filter class="solr.SnowballPorterFilterFactory" language="English" />
   <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
</fieldType>

<fieldType class="solr.TextField" name="text_fr" positionIncrementGap="100">
<analyzer type="index">
   <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
   <tokenizer class="solr.ClassicTokenizerFactory"/>
   <filter class="solr.SynonymFilterFactory" expand="false" ignoreCase="true" synonyms="synonyms.txt" tokenizerFactory="solr.StandardTokenizerFactory"/>
   <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.StopFilterFactory" format="snowball" ignoreCase="true" words="lang/stopwords_fr.txt"/>
   <filter class="solr.FrenchMinimalStemFilterFactory"/>
   <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_fr.txt" ignoreCase="true"/>
   <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
<analyzer type="query">
   <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
   <tokenizer class="solr.ClassicTokenizerFactory"/>
   <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.StopFilterFactory" format="snowball" ignoreCase="true" words="lang/stopwords_fr.txt"/>
   <filter class="solr.FrenchMinimalStemFilterFactory"/>
   <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_fr.txt" ignoreCase="true"/>
   <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
</fieldType>

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: pf2

Posted by Rick Leir <rl...@leirtech.com>.

Emir
sow=false: thanks for this!

The problem seems to be due to a stopword: everything is fine when I avoid stopwords in my query. The stopword presumably gets removed during query analysis, so I would perhaps need to allow some slop for pf2.
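
One way to picture the stopword effect is the toy model below. It is plain Python, not Solr or Lucene code, and it rests on two assumptions that the thread does not confirm: that the pf2 word pairs are formed from the raw query words before analysis, and that a removed stopword still leaves a position gap. The stopword set and the sample text are made up for illustration.

```python
# Toy model only. Hypothetical subset of stopwords; the real list
# lives in lang/stopwords_fr.txt.
STOPWORDS = {"de", "du", "la", "le", "les"}


def analyze(text):
    """Lowercase, split on whitespace, drop stopwords but keep positions."""
    return [(word, pos)
            for pos, word in enumerate(text.lower().split())
            if word not in STOPWORDS]


def word_pairs(query):
    """Adjacent word pairs of the raw query, as pf2-style shingling might see them."""
    words = query.split()
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]


def pair_matches(doc_tokens, first, second, slop=0):
    """Simplified in-order two-term phrase match allowing a gap of at most `slop`."""
    first_positions = [p for w, p in doc_tokens if w == first]
    second_positions = [p for w, p in doc_tokens if w == second]
    return any(0 < b - a <= 1 + slop
               for a in first_positions for b in second_positions)


if __name__ == "__main__":
    doc = analyze("Avion de papier")
    print(doc)
    # Each raw pair contains the stopword, so one term survives per pair.
    for pair in word_pairs("avion de papier"):
        print(pair, "->", analyze(pair))
    # The surviving terms sit two positions apart in the document.
    print(pair_matches(doc, "avion", "papier", slop=0))
    print(pair_matches(doc, "avion", "papier", slop=1))
```

In this model every pair loses one of its two words, so no two-term phrase is left to match, and the two surviving words are only adjacent if one position of slop is allowed. If I remember correctly, edismax has a ps2 parameter for slop on pf2 phrases, but whether that fixes the real case would need checking against the debugQuery output.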
Thanks 
Rick

On January 26, 2018 8:14:06 AM EST, "Emir Arnautović" <em...@sematext.com> wrote:
>Hi Rick,
>It does not work in any case or it does not work for some cases - e.g.
>something like l’avion? Maybe you can try use sow=false and see if it
>will help.
>
>Cheers,
>Emir
>--
>Monitoring - Log Management - Alerting - Anomaly Detection
>Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Re: pf2

Posted by Emir Arnautović <em...@sematext.com>.

Hi Rick,
Does it fail in every case, or only in some cases, e.g. something like l’avion? Maybe you can try using sow=false and see if it helps.
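
The l’avion case is about elided articles, which the ElisionFilterFactory in the text_fr chain is meant to strip so that l'avion is indexed and searched as avion. A rough Python sketch of that idea follows. It is a toy illustration, not the Lucene implementation. The article set is a hypothetical subset of what lang/contractions_fr.txt might contain, and it assumes the tokenizer hands over the word with its apostrophe still attached, which depends on the tokenizer and on which apostrophe character was typed.

```python
# Toy illustration only. Hypothetical subset of the elided articles
# that lang/contractions_fr.txt might list.
ARTICLES = {"l", "d", "j", "m", "n", "s", "t", "c", "qu"}
APOSTROPHES = ("'", "\u2019")  # straight and typographic apostrophe


def strip_elision(token):
    """Drop a leading elided article such as l' or d' from a single token."""
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            # Only strip when the part before the apostrophe is a known article.
            if token[:i].lower() in ARTICLES:
                return token[i + 1:]
            return token
    return token


if __name__ == "__main__":
    for word in ["l'avion", "l\u2019avion", "aujourd'hui", "avion"]:
        print(word, "->", strip_elision(word))
```

If the index-time and query-time tokens differ for such a word, for example because the apostrophe character is handled differently on each side, a phrase built from it would not line up, so it is worth testing one query with an elided word and one without.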

Cheers,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 26 Jan 2018, at 13:38, Rick Leir <rl...@leirtech.com> wrote:
> 
> Emir
> Thanks, I will do when I get off this bus.
> 
> I have run the text thru the SolrAdmin Analyzer, it looks fine.
> 
> According to the debugQuery output, individual words match in the qf, but not the pair that pf2 should match.
> 
> I compare the configs for English and French, and they are the same apart from the analysis chain which is below. Only French fails. I will take out filters one by one and attempt to find which is causing this.
> Cheers -- Rick

Re: pf2

Posted by Rick Leir <rl...@leirtech.com>.

Emir
Thanks, I will do that when I get off this bus.

I have run the text through the Solr Admin Analysis screen, and it looks fine.

According to the debugQuery output, individual words match in the qf, but not the pair that pf2 should match.

I compared the configs for English and French, and they are the same apart from the analysis chain (the fieldtypes in my original post). Only French fails. I will take out filters one by one to find which one is causing this.
Cheers -- Rick

On January 26, 2018 4:09:51 AM EST, "Emir Arnautović" <em...@sematext.com> wrote:
>Hi Rick,
>Can you include sample of your query and text that should match.
>
>Thanks,
>Emir
>--
>Monitoring - Log Management - Alerting - Anomaly Detection
>Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Re: pf2

Posted by Emir Arnautović <em...@sematext.com>.

Hi Rick,
Can you include a sample of your query and the text that should match?

Thanks,
Emir


