Posted to solr-user@lucene.apache.org by OTH <om...@gmail.com> on 2017/04/16 15:46:01 UTC

AnalyzingInfixSuggester performance

Hello,

From what I understand, the AnalyzingInfixSuggester uses a simple
Lucene query, so I was wondering how this suggester would have better
performance than a simple Solr 'select' query on a regular Solr index
(with an asterisk placed at the start and end of the query string).  I
could understand why, say, an FST-based suggester would be faster, but I
wanted to confirm whether that is indeed the case with AnalyzingInfixSuggester.

One reason I ask is:
I needed the results to be boosted based on the value of another field;
e.g., if a user in the UK is searching for cities, then I'd need the cities
which are in the UK to be boosted.  I was able to do this with a regular
Solr index by adding something like these parameters:
defType=edismax&bq=country:UK^2.0
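
For illustration, a minimal sketch of how such a request could be assembled.
The core name `cities`, the field `name`, and the host are assumptions made
for this example; only `defType=edismax` and `bq=country:UK^2.0` come from
the parameters above. User input is not escaped here, which a real client
would need to do.

```python
from urllib.parse import urlencode


def build_city_query(base_url, user_input, user_country, rows=10):
    """Build a /select URL that infix-matches user_input and boosts one country.

    `name` is a hypothetical field holding the city name.
    """
    params = {
        # Asterisk at the start and end of the query string, as described above.
        "q": f"name:*{user_input}*",
        "defType": "edismax",
        # Boost query: documents from the user's country score higher.
        "bq": f"country:{user_country}^2.0",
        "rows": rows,
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr core named "cities".
url = build_city_query("http://localhost:8983/solr/cities", "lond", "UK")
print(url)
```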

However, I'm not sure if this is possible with the Suggester.  Moreover -
other than the 'country' field above, there are other fields as well which
I need to be returned with the results.  Since the Suggester seems to only
allow one additional field, called 'payload', I'm able to do this by
putting the values of all the other fields into a JSON and then placing
that into the 'payload' field - however, I don't know if it would be
possible then to incorporate the boosting mechanism I showed above.
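
A rough sketch of the payload workaround just described, assuming a
hypothetical city record with `country`, `population` and `timezone` fields.
The suggester hands the payload back as an opaque string, so anything stored
in it can be read by the client but plays no part in ranking.

```python
import json


def make_payload(doc, fields):
    """Pack the extra fields of a document into one JSON string."""
    return json.dumps({name: doc[name] for name in fields}, sort_keys=True)


def read_payload(payload):
    """Unpack the JSON string returned alongside a suggestion."""
    return json.loads(payload)


# Hypothetical record; the field names are illustrative only.
city = {
    "name": "London",
    "country": "UK",
    "population": 8800000,
    "timezone": "Europe/London",
}

payload = make_payload(city, ["country", "population", "timezone"])
restored = read_payload(payload)
print(payload)
```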

So I was thinking of just using a regular Solr index instead of the
Suggester.  I wanted to confirm: what, if any, is the performance improvement
from using the AnalyzingInfixSuggester over a regular index?

Much thanks

Re: AnalyzingInfixSuggester performance

Posted by Michael McCandless <lu...@mikemccandless.com>.

It also indexes edge ngrams for short sequences (e.g. a*, b*, etc.),
switches to an ordinary PrefixQuery for longer sequences, and does some work
at search time to do the "infixing".

But yeah otherwise that's it.
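
To make the short-prefix versus long-prefix split concrete, here is a toy
sketch in plain Python. It is not Lucene code: the dictionaries stand in for
the index, and the cutoff `MIN_PREFIX_CHARS = 4` is an assumption chosen for
this example. Short prefixes are answered by an exact lookup of a
pre-indexed edge n-gram, while longer ones fall back to a prefix scan over
the full tokens.

```python
from collections import defaultdict

MIN_PREFIX_CHARS = 4  # assumed cutoff for this sketch


def build_index(suggestions):
    """Index every token, plus its edge n-grams shorter than the cutoff."""
    exact_prefix = defaultdict(set)  # short prefix -> ids of matching entries
    terms = defaultdict(set)         # full token   -> ids of matching entries
    for doc_id, text in enumerate(suggestions):
        for token in text.lower().split():
            terms[token].add(doc_id)
            longest = min(len(token), MIN_PREFIX_CHARS - 1)
            for n in range(1, longest + 1):
                exact_prefix[token[:n]].add(doc_id)
    return exact_prefix, terms


def lookup(prefix, exact_prefix, terms):
    """Return ids of entries having any token that starts with prefix."""
    prefix = prefix.lower()
    if len(prefix) < MIN_PREFIX_CHARS:
        # Short prefix: a single exact lookup of the indexed edge n-gram.
        return sorted(exact_prefix.get(prefix, ()))
    # Longer prefix: scan the tokens, analogous to a prefix query.
    hits = set()
    for token, doc_ids in terms.items():
        if token.startswith(prefix):
            hits |= doc_ids
    return sorted(hits)


names = ["New York", "York", "Newcastle upon Tyne", "Yorktown"]
exact_prefix, terms = build_index(names)
print(lookup("yo", exact_prefix, terms))    # short: edge n-gram lookup
print(lookup("york", exact_prefix, terms))  # long: prefix scan
```

Because matching is done per token, "york" also finds "New York", which is
the infix behaviour the suggester is named for.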

If your ranking at lookup time doesn't exactly match the weight, but is at
least "roughly" correlated with it, you could still use the fast early
termination, but collect deeper than just the top N so that you are likely
to find the best hits according to your ranking function.
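
A toy sketch of that "collect deeper, then re-rank" idea, in plain Python
rather than Lucene. The entries, the `depth_factor` parameter and the 2.0
country boost are all assumptions for illustration; the list is sorted by
weight up front to stand in for the index-time sort.

```python
from itertools import islice


def suggest_reranked(sorted_entries, prefix, user_country, top_n=2, depth_factor=3):
    """Collect top_n * depth_factor matches in weight order, then re-rank them."""
    prefix = prefix.lower()
    matches = (
        entry for entry in sorted_entries
        if any(tok.startswith(prefix) for tok in entry["text"].lower().split())
    )
    # Early termination: stop after a bounded number of matches,
    # but go deeper than top_n so the re-ranking has candidates to promote.
    candidates = list(islice(matches, top_n * depth_factor))

    def score(entry):
        # Lookup-time ranking: roughly correlated with weight, not identical.
        boost = 2.0 if entry["country"] == user_country else 1.0
        return entry["weight"] * boost

    return sorted(candidates, key=score, reverse=True)[:top_n]


# Hypothetical data, sorted by weight descending like the index-time sort.
entries = sorted(
    [
        {"text": "Los Angeles", "country": "US", "weight": 95},
        {"text": "London", "country": "UK", "weight": 90},
        {"text": "Londrina", "country": "BR", "weight": 50},
        {"text": "Long Beach", "country": "US", "weight": 45},
        {"text": "London", "country": "CA", "weight": 40},
        {"text": "Londonderry", "country": "UK", "weight": 30},
    ],
    key=lambda entry: entry["weight"],
    reverse=True,
)

deep = suggest_reranked(entries, "lon", "UK", top_n=2, depth_factor=3)
shallow = suggest_reranked(entries, "lon", "UK", top_n=2, depth_factor=1)
print([(e["text"], e["country"]) for e in deep])
print([(e["text"], e["country"]) for e in shallow])
```

With `depth_factor=1` the low-weight UK entry is never collected, so the
boost cannot promote it; collecting deeper gives it a chance to surface.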

Mike McCandless

http://blog.mikemccandless.com

(Replying to OTH's message of Tue, Apr 18, 2017 at 4:35 PM; the earlier
messages in the thread appear in full below.)

Re: AnalyzingInfixSuggester performance

Posted by OTH <om...@gmail.com>.

I see.  I had actually overlooked the fact that Suggester provides a
'weightField', and I could possibly use that in my case instead of the
regular Solr index with bq.

So if I understand then - the main advantage of using the
AnalyzingInfixSuggester instead of a regular Solr index (since both are
using standard Lucene?) is that the AInfixSuggester does sorting at
index-time using the weightField?  So it's only ever advantageous to use
this Suggester if you need sorting based on a field?

Thanks

(Replying to Michael McCandless's message of Tue, Apr 18, 2017 at 2:20 PM,
which appears in full below.)

Re: AnalyzingInfixSuggester performance

Posted by Michael McCandless <lu...@mikemccandless.com>.

AnalyzingInfixSuggester uses an index-time sort to order all postings by the
suggest weight, so that lookup is extremely fast as long as you sort by the
suggest weight.

But if you need to rank at lookup time by something not "congruent" with
the index-time sort then you lose that benefit.
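
As a simplified illustration of why the index-time sort helps (plain Python,
not the Lucene implementation): when entries are already stored in weight
order, a lookup can stop as soon as it has N matches, because nothing further
down can outrank them. The data below is made up, and a real index walks the
postings of the matching terms rather than every entry.

```python
def suggest(sorted_entries, prefix, top_n):
    """Return the first top_n matches and how many entries were examined."""
    prefix = prefix.lower()
    hits = []
    scanned = 0
    for text, weight in sorted_entries:
        scanned += 1
        if any(tok.startswith(prefix) for tok in text.lower().split()):
            hits.append((text, weight))
            if len(hits) == top_n:
                break  # early termination: the rest all have lower weight
    return hits, scanned


# Hypothetical entries, already sorted by weight descending.
entries = [
    ("Los Angeles", 95),
    ("London", 90),
    ("Londrina", 50),
    ("Long Beach", 45),
    ("Londonderry", 30),
]

hits, scanned = suggest(entries, "lon", top_n=2)
print(hits, scanned)
```

If the ranking at lookup time were unrelated to the stored order, every match
would have to be collected and sorted, and the early exit would no longer be
safe.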

Mike McCandless

http://blog.mikemccandless.com

(Replying to OTH's original message of Sun, Apr 16, 2017 at 11:46 AM, at the
top of this thread.)