Posted to users@solr.apache.org by Nitin Arora <ni...@gmail.com> on 2021/04/29 18:23:41 UTC

Learning to Rank - within solr or outside?

Hello, can someone share the pros and cons of using Solr's Learning to Rank
versus an external re-ranker applied after fetching Solr's top results? Which
option would you recommend?

Thanks in advance,

Re: Learning to Rank - within solr or outside?

Posted by Petter Egesund <pe...@sannsyn.com>.
Hi Nitin

We rerank the top-n documents, based on the original Solr score, from inside
a Solr plugin. This is very fast.
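
As a rough illustration of reranking only the top-n hits (a sketch, not taken
from the repository linked in this message; the document shape, the
`popularity` field, and the `boosted` function are hypothetical), the idea can
be written as:

```python
from typing import Callable, Dict, List

# Hypothetical document shape: a dict with "id", "score" and any stored fields.
Doc = Dict[str, object]


def rerank_top_n(docs: List[Doc], n: int, rescore: Callable[[Doc], float]) -> List[Doc]:
    """Re-order only the first n docs by rescore(doc); keep the tail in Solr's order."""
    head, tail = docs[:n], docs[n:]
    # sorted() is stable, so documents with equal new scores keep their original order.
    head = sorted(head, key=rescore, reverse=True)
    return head + tail


def boosted(doc: Doc) -> float:
    # Hypothetical rescoring: the original Solr score scaled by a popularity boost.
    return float(doc["score"]) * (1.0 + float(doc.get("popularity", 0.0)))


results = [
    {"id": "a", "score": 3.0, "popularity": 0.0},
    {"id": "b", "score": 2.5, "popularity": 0.5},
    {"id": "c", "score": 2.0, "popularity": 1.0},
    {"id": "d", "score": 1.0, "popularity": 9.0},  # outside the top-3 window
]
reranked = rerank_top_n(results, n=3, rescore=boosted)
print([d["id"] for d in reranked])
```

Only the first three documents are re-ordered here; "d" stays last even though
its boosted score would be the highest, because it falls outside the window.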

We use this code: https://github.com/pegesund/clojureranker.

It is Clojure code, but it should not be hard to make a Java version :)

Petter

On Wed, 5 May 2021 at 17:12, Umesh Prasad <um...@gmail.com> wrote:

> [...]

Re: Learning to Rank - within solr or outside?

Posted by Umesh Prasad <um...@gmail.com>.
Hi Nitin,
At Flipkart, we used an external re-ranker (we called it the L2 re-ranker).
In the first version we tried building an auxiliary store for ephemeral
fields and plugging them into Solr scoring. It didn't scale. Retrieving
features from the Solr index was one challenge, but redundant matching in the
first phase was also wasteful.
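
A minimal sketch of a second-phase re-ranker that runs outside Solr might look
like the following. It is only an illustration under stated assumptions: the
feature store contents, the feature names, and the linear weights are all
hypothetical and are not taken from the Flipkart system.

```python
from typing import Dict, List, Tuple

# Hypothetical external store of fast-changing ("ephemeral") fields, keyed by
# document id. In a real system this could be a key-value service.
FEATURE_STORE: Dict[str, Dict[str, float]] = {
    "p1": {"in_stock": 1.0, "discount": 0.10},
    "p2": {"in_stock": 0.0, "discount": 0.50},
    "p3": {"in_stock": 1.0, "discount": 0.30},
}

# Hypothetical weights of a simple linear model.
WEIGHTS: Dict[str, float] = {"solr_score": 1.0, "in_stock": 2.0, "discount": 1.0}


def l2_rerank(candidates: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Re-score (doc_id, solr_score) pairs from the first phase with external features."""
    rescored = []
    for doc_id, solr_score in candidates:
        # Join the first-phase score with features fetched outside Solr.
        features = {"solr_score": solr_score, **FEATURE_STORE.get(doc_id, {})}
        score = sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())
        rescored.append((doc_id, score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# First phase: ids and scores as they might come back from a Solr query.
candidates = [("p1", 1.2), ("p2", 1.5), ("p3", 0.8)]
ranked = l2_rerank(candidates)
print([doc_id for doc_id, _ in ranked])
```

In this sketch Solr is used only for matching and first-phase scoring, and the
out-of-stock item "p2" drops to the end despite having the highest Solr score.
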
The L2 re-ranker outside Solr worked really well. We tried a bunch of other
optimizations as well: a sorted index and a single-segment index (built
nightly through MR). You can watch our talks from slashN (the Flipkart tech
conference).

1. Near Real-Time Indexing - Umesh Prasad & Thejus V M, Flipkart
https://www.youtube.com/watch?v=05rX0mJ2N4U&list=PLf85w1fkhA5EW7KvZULMKm97REOCxsUZz&index=2

2. Resource optimisation for Search at Scale in Flipkart - Monish Gandhi
https://www.youtube.com/watch?v=PCFJ7iZ1Uvs&list=PLf85w1fkhA5EW7KvZULMKm97REOCxsUZz&index=8

These are lessons learnt in the trenches while managing a large cluster and
continuously growing traffic. I would be happy to answer any questions you
have about the talks.

Thanks & Regards
Umesh Prasad

https://www.linkedin.com/in/umesh-prasad-iitk/






On Wed, 5 May 2021 at 20:22, Alessandro Benedetti <a....@sease.io>
wrote:

> [...]

Re: Learning to Rank - within solr or outside?

Posted by Alessandro Benedetti <a....@sease.io>.
Hi Nitin,
Based on my experience, if you have document-level features and
query-dependent features (query-document level), using the internal Solr
re-ranker would be beneficial in terms of performance.
The way Solr extracts feature values from the index data structures is
expensive, but it should be much cheaper than fetching the top-K from
Solr and then extracting all the feature vectors and re-ranking outside.
I never did an explicit benchmark comparison, though; it could be an
interesting idea for a blog post.
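
For reference, a request that uses the internal Solr re-ranker can be sketched
as below. This assumes the Learning To Rank module is enabled, the `[features]`
document transformer is configured, and a model has already been uploaded; the
model name `myModel`, the query text, and the helper `ltr_params` are
placeholders for illustration.

```python
from urllib.parse import urlencode


def ltr_params(user_query: str, model: str, rerank_docs: int) -> dict:
    """Build query parameters that ask Solr to re-rank the top hits with an LTR model."""
    return {
        "q": user_query,
        # rq is Solr's re-rank query; the ltr parser applies the named model
        # to the top reRankDocs documents of the first-phase result.
        "rq": f"{{!ltr model={model} reRankDocs={rerank_docs}}}",
        # [features] asks Solr to return the extracted feature values per document.
        "fl": "id,score,[features]",
    }


params = ltr_params("laptop bag", model="myModel", rerank_docs=100)
query_string = urlencode(params)
print(params["rq"])
print(query_string)
```

The resulting query string would be appended to the collection's select
handler URL, which depends on the deployment and is not shown here.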

Cheers

--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io

