Posted to solr-user@lucene.apache.org by elisabeth benoit <el...@gmail.com> on 2015/12/14 14:52:37 UTC

pf2 pf3 and stopwords

Hello,

I am using Solr 4.10.1. I have a field with stopwords:


<filter class="solr.StopFilterFactory" ignoreCase="1" words="stopwords.txt"
enablePositionIncrements="true"/>

And I use pf2 and pf3 on that field with a slop of 0.

If the request is "Gare Saint Lazare", and I have a document "Gare de Saint
Lazare", "de" being a stopword, this document doesn't get the pf3 boost,
because of "de".

Is this normal? Is it a bug? Or is something wrong with my
configuration?

Best regards,
Elisabeth
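
For readers who want to reproduce the setup, here is a rough sketch of what such a request might look like, written in Python. The post does not show the actual request, so everything here is an assumption: the field name `name` is hypothetical, and using the eDisMax `ps2`/`ps3` parameters to set the slop to 0 is only one plausible way the poster could have configured it.

```python
from urllib.parse import urlencode

# Hypothetical eDisMax request parameters; "name" is a made-up field name.
params = {
    "defType": "edismax",
    "q": "Gare Saint Lazare",
    "qf": "name",
    "pf2": "name",  # boost documents where pairs of query words form a phrase
    "pf3": "name",  # boost documents where triples of query words form a phrase
    "ps2": "0",     # slop for pf2 phrases (assumed to be how the slop of 0 was set)
    "ps3": "0",     # slop for pf3 phrases
}

# Query string that would be appended to the select handler URL.
query_string = urlencode(params)
print(query_string)
```

With these settings, the pf3 boost applies only when the three query terms sit at consecutive positions in the indexed field.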

Re: pf2 pf3 and stopwords

Posted by elisabeth benoit <el...@gmail.com>.
OK, thanks a lot for your advice.

I'll try that.



2015-12-17 10:05 GMT+01:00 Binoy Dalal <bi...@gmail.com>:

> For this case of inversion in particular a slop of 1 won't cause any issues
> since such a reverse match will require the slop to be 2
>

Re: pf2 pf3 and stopwords

Posted by Binoy Dalal <bi...@gmail.com>.
For this case of inversion in particular, a slop of 1 won't cause any issues,
since such a reverse match will require the slop to be 2.
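
A small Python sketch of the point above. It is a simplified model of sloppy phrase matching, not Lucene's actual code, and the helper name `phrase_matches` is made up for illustration. Each term's document position is shifted by its offset in the query, and the phrase matches when the spread of the shifted positions does not exceed the slop.

```python
def phrase_matches(query_terms, doc_terms, slop):
    """Simplified sloppy-phrase check.

    Assumes each query term occurs at most once in the document.
    """
    positions = {term: pos for pos, term in enumerate(doc_terms)}
    if any(term not in positions for term in query_terms):
        return False
    # For an exact in-order phrase, all shifted positions are identical.
    shifted = [positions[term] - offset for offset, term in enumerate(query_terms)]
    return max(shifted) - min(shifted) <= slop


query = ["paris", "charonne"]
in_order = ["paris", "charonne"]
reversed_doc = ["charonne", "paris"]

print(phrase_matches(query, in_order, 0))      # True
print(phrase_matches(query, reversed_doc, 1))  # False
print(phrase_matches(query, reversed_doc, 2))  # True
```

Under this model, swapping two adjacent terms costs a slop of 2, so a slop of 1 still keeps "paris charonne" and "charonne paris" apart.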

On Thu, 17 Dec 2015, 14:20 elisabeth benoit <el...@gmail.com>
wrote:

> Inversion (paris charonne or charonne paris) cannot be scored the same.
>
-- 
Regards,
Binoy Dalal

Re: pf2 pf3 and stopwords

Posted by elisabeth benoit <el...@gmail.com>.
Inversion (paris charonne or charonne paris) cannot be scored the same.

2015-12-16 11:08 GMT+01:00 Binoy Dalal <bi...@gmail.com>:

> What is your exact use case?
>

Re: pf2 pf3 and stopwords

Posted by Binoy Dalal <bi...@gmail.com>.
What is your exact use case?

On Wed, 16 Dec 2015, 13:40 elisabeth benoit <el...@gmail.com>
wrote:

> Thanks for your answer.
>
> Actually, using a slop of 1 is something I can't do (because of other
> specifications)
>
> I guess I'll index differently.
>
> Best regards,
> Elisabeth
>
-- 
Regards,
Binoy Dalal

Re: pf2 pf3 and stopwords

Posted by elisabeth benoit <el...@gmail.com>.
Thanks for your answer.

Actually, using a slop of 1 is something I can't do (because of other
specifications).

I guess I'll index differently.

Best regards,
Elisabeth

2015-12-14 16:24 GMT+01:00 Binoy Dalal <bi...@gmail.com>:

> Moreover, the stopword de will work on your queries and not on your
> documents, meaning if you query 'Gare de Saint Lazare', the terms actually
> searched for will be Gare Saint and Lazare, 'de' will be filtered out.
>

Re: pf2 pf3 and stopwords

Posted by Binoy Dalal <bi...@gmail.com>.
Moreover, the stopword 'de' will work on your queries and not on your
documents, meaning that if you query 'Gare de Saint Lazare', the terms
actually searched for will be Gare, Saint and Lazare; 'de' will be filtered
out.

On Mon, Dec 14, 2015 at 8:49 PM Binoy Dalal <bi...@gmail.com> wrote:

> This isn't a bug. During pf3 matching, since your query has only three
> tokens, the entire query will be treated as a single phrase, and with slop
> = 0, any word that comes in the middle of your query  - 'de' in this case
> will cause the phrase to not be matched. If you want to get around this,
> try setting your slop = 1 in which case it should match Gare Saint Lazare
> even with the de in it.
>
-- 
Regards,
Binoy Dalal

Re: pf2 pf3 and stopwords

Posted by Binoy Dalal <bi...@gmail.com>.
This isn't a bug. During pf3 matching, since your query has only three
tokens, the entire query is treated as a single phrase, and with slop = 0,
any word that comes in the middle of that phrase in the document ('de' in
this case) will cause the phrase not to match. If you want to get around
this, try setting your slop = 1, in which case it should match Gare Saint
Lazare even with the 'de' in it.
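
A minimal sketch of why the gap matters, in plain Python rather than Solr itself. It assumes the stop filter also runs at index time and that enablePositionIncrements="true" leaves a position hole where 'de' was removed. The helper names `analyze` and `required_slop` are made up for illustration, and the slop computation is a simplified model of sloppy phrase matching, not Lucene's actual implementation.

```python
from itertools import product

STOPWORDS = {"de"}  # assumption: stopwords.txt contains "de"


def analyze(text, stopwords=STOPWORDS):
    """Lowercase, split on whitespace, drop stopwords but keep their position gap."""
    positions = {}
    for pos, token in enumerate(text.lower().split()):
        if token not in stopwords:
            positions.setdefault(token, []).append(pos)
    return positions


def required_slop(query_terms, doc_positions):
    """Smallest slop at which the phrase matches, or None if a term is missing.

    Simplification: assumes the query terms are all distinct.
    """
    try:
        candidates = [doc_positions[term] for term in query_terms]
    except KeyError:
        return None
    best = None
    for combo in product(*candidates):
        # Shift each document position by the term's offset in the query;
        # an exact phrase gives identical shifted values.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        spread = max(shifted) - min(shifted)
        best = spread if best is None else min(best, spread)
    return best


query = ["gare", "saint", "lazare"]
print(required_slop(query, analyze("Gare Saint Lazare")))     # 0
print(required_slop(query, analyze("Gare de Saint Lazare")))  # 1
```

In this model "Gare de Saint Lazare" is indexed as gare(0), saint(2), lazare(3), so the phrase needs a slop of at least 1 even though 'de' itself is never stored.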

On Mon, Dec 14, 2015 at 7:22 PM elisabeth benoit <el...@gmail.com>
wrote:

> Hello,
>
> I am using solr 4.10.1. I have a field with stopwords
>
>
> <filter class="solr.StopFilterFactory" ignoreCase="1" words="stopwords.txt"
> enablePositionIncrements="true"/>
>
> And I use pf2 pf3 on that field with a slop of 0.
>
> If the request is "Gare Saint Lazare", and I have a document "Gare de Saint
> Lazare", "de" being a stopword, this document doesn't get the pf3 boost,
> because of "de".
>
> I was wondering, is this normal? is this a bug? is something wrong with my
> configuration?
>
> Best regards,
> Elisabeth
>
-- 
Regards,
Binoy Dalal