Posted to solr-user@lucene.apache.org by Jack Krupansky <ja...@gmail.com> on 2015/03/13 22:08:12 UTC

Distributed IDF performance

Does anybody have any actual performance data or even a rough formula for
calculating the overhead for using the new Solr 5.0 Distributed IDF (
SOLR-1632 <https://issues.apache.org/jira/browse/SOLR-1632>)?

Is there any guidance on which StatsInfo plugin is best to use?

Are many people now using Distributed IDF as their default?

I'm not currently using this, but the existing doc and Jira are too minimal
to offer the guidance requested above. Mostly I'm just curious.

Thanks.

-- Jack Krupansky

Re: Distributed IDF performance

Posted by Anshum Gupta <an...@anshumgupta.net>.
np!

I forgot to mention that I didn't notice any considerable performance hit
in my tests. The QTimes were off by barely 5%.

-- 
Anshum Gupta

Re: Distributed IDF performance

Posted by Jack Krupansky <ja...@gmail.com>.
Oops... I said "StatsInfo" and that should have been "StatsCache"
("<statsCache .../>").

-- Jack Krupansky


Re: Distributed IDF performance

Posted by Anshum Gupta <an...@anshumgupta.net>.
There's no rough formula or performance data that I know of at this point.
As for guidance, if you want to use global stats, my obvious choice would
be the LRUStatsCache.
Before committing, I did run some tests on my MacBook but, as I said back
then, they shouldn't be taken entirely at face value. The tests didn't
involve any network and covered only about 20 million docs and synthetic queries.
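
For reference, a minimal sketch of what selecting that cache might look
like in solrconfig.xml. The fully qualified class name below is not given
in this thread; it assumes the stock implementation shipped in the
org.apache.solr.search.stats package, so check it against your Solr
version before relying on it.

```xml
<!-- solrconfig.xml, placed directly inside the top-level <config> element. -->
<!-- Assumption: class name taken from the stock Solr 5.0 package, -->
<!-- not from this thread. Omitting <statsCache> keeps the default -->
<!-- behaviour of per-shard (local) term statistics. -->
<statsCache class="org.apache.solr.search.stats.LRUStatsCache"/>
```

The same element should accept the other stats cache implementations by
swapping the class attribute, which makes it easy to compare QTimes
between them on your own data.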

-- 
Anshum Gupta