Posted to solr-user@lucene.apache.org by Joel Bernstein <jo...@gmail.com> on 2013/11/09 18:00:51 UTC

Re: Solr grouping performance problem

Shamik,

The CollapsingQParserPlugin will be available in Solr 4.6, and it should
perform much better when collapsing on a high-cardinality field. The 4.6
code doesn't port directly back to Solr 4.4, though, due to some changes in
the build for 4.6. The JIRA ticket has a conversation about this, and
you may be able to follow it to create a patch for 4.4.

Joel
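
For illustration, here is a minimal sketch of how the grouping request quoted
below might be rewritten as a collapse request once Solr 4.6 is available. The
host, port, handler path, and the "dedup" field come from the original query;
the {!collapse field=...} filter is the CollapsingQParserPlugin's local-params
syntax. The content-group filter is left out for brevity, and the helper names
are hypothetical. It only builds the URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed values, copied from the query quoted in this thread.
BASE_URL = "http://localhost:8083/solr/browse"
DEDUP_FIELD = "dedup"

# Parameters shared by both variants (content-group filter omitted for brevity).
COMMON_PARAMS = [
    ("q", "line"),
    ("fq", "language:(english)"),
    ("wt", "xml"),
    ("rows", "10"),
    ("start", "0"),
    ("sort", "score desc"),
]


def grouping_url():
    """Result grouping, as in the original slow query."""
    params = COMMON_PARAMS + [
        ("group", "true"),
        ("group.field", DEDUP_FIELD),
        ("group.ngroups", "true"),
    ]
    return BASE_URL + "?" + urlencode(params)


def collapse_url():
    """Collapsing via a filter query (CollapsingQParserPlugin, Solr 4.6+)."""
    params = COMMON_PARAMS + [("fq", "{!collapse field=%s}" % DEDUP_FIELD)]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(grouping_url())
    print(collapse_url())
```

Whether the collapse variant is actually faster on a given index would still
need to be measured against real data.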


On Thu, Oct 31, 2013 at 1:37 AM, Shamik Bandopadhyay <sh...@gmail.com> wrote:

> Hi,
>
>    I recently upgraded from Master-Slave mode to SolrCloud (4.4). One of
> the changes I made to the queries was to add grouping to remove
> duplicate results. The grouping is done on a specific field, but the change
> seems to have had a huge effect on query performance: the "group" option
> slowed queries down by a factor of 10. For example, this query takes 1 sec to
> execute. The number of results is around 105387.
>
>
> http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn & Explore" OR ADSKContentGroup-local:"Getting Started")&q=line&sort=score desc&group=true&group.field=dedup&group.ngroups=true
>
> If I exclude the group option, it comes down to 190 ms:
>
>
> http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn & Explore" OR ADSKContentGroup-local:"Getting Started")&q=line
>
> I'm running this query against an 8 million doc index. I have 2 shards with 1
> replica each, running on an m1x.large EC2 instance, each having 8 GB of
> allocated memory.
>
> Is this a known issue, or am I missing something that is making this query
> expensive?
>
> I bumped into this JIRA -->
> https://issues.apache.org/jira/browse/SOLR-5027 which
> talks about CollapsingQParserPlugin as an alternative to grouping, but that
> seems to be available only from 4.6. Just wondering if it can be an alternative in
> my case and whether it's possible to apply it as a patch to version 4.4.
>
> Any pointers will be appreciated.
>
> - Thanks,
> Shamik
>



-- 
Joel Bernstein
Search Engineer at Heliosearch

Re: Solr grouping performance problem

Posted by Erick Erickson <er...@gmail.com>.
In fact, there's some movement towards starting the release process this
week. Stay tuned!

Erick


On Mon, Nov 11, 2013 at 4:12 PM, shamik <sh...@gmail.com> wrote:

> Thanks Joel, appreciate your help. Is Solr 4.6 due this year?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-grouping-performance-porblem-tp4098565p4100358.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Solr grouping performance problem

Posted by shamik <sh...@gmail.com>.
Thanks for the update, Shawn. I will look forward to the release.



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-grouping-performance-porblem-tp4098565p4101314.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr grouping performance problem

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/11/2013 2:12 PM, shamik wrote:
> Thanks Joel, appreciate your help. Is Solr 4.6 due this year?

The job of release manager for 4.6 has already been claimed.  There 
should be a release candidate posted on the dev list sometime on 
November 12th (tomorrow) in US time zones, unless a serious problem
is discovered.

After the RC gets posted, there is a 72-hour voting period where 
committers vote whether or not to release that version.  If someone 
finds a problem that warrants a negative vote during that 72-hour
period, it will be put on hold until the problem is fixed.  A new RC 
will eventually be made available and the 72-hour voting period will 
begin again.  When the vote finally passes, the release process will 
begin.  It typically takes 2-3 days after that before the official 
announcement is made.

What this means in real terms is that 4.6 will most likely be out before 
the end of November.  It would take a major series of bugs and problems 
to keep that from happening.
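
As a rough check of that estimate, the timeline above can be worked through
with simple date arithmetic. This is only a sketch: it assumes the release
candidate is posted on November 12th, 2013 (the year is taken from the thread
dates), that the first vote passes, and that the announcement follows 2-3 days
later. The function name is hypothetical.

```python
from datetime import date, timedelta

# Assumptions: RC posted as planned, and the first 72-hour vote passes.
RC_POSTED = date(2013, 11, 12)
VOTE_PERIOD = timedelta(hours=72)
ANNOUNCE_DELAY_DAYS = (2, 3)  # "typically takes 2-3 days" after the vote


def announcement_window(rc_posted):
    """Return (earliest, latest) announcement dates if the first vote passes."""
    vote_passes = rc_posted + VOTE_PERIOD
    earliest = vote_passes + timedelta(days=ANNOUNCE_DELAY_DAYS[0])
    latest = vote_passes + timedelta(days=ANNOUNCE_DELAY_DAYS[1])
    return earliest, latest


if __name__ == "__main__":
    earliest, latest = announcement_window(RC_POSTED)
    print(earliest, latest)  # 2013-11-17 2013-11-18
```

Each failed vote would add at least another 72 hours plus the time needed to
fix the problem, which still leaves room before the end of the month.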

Because of the upcoming holiday madness, I think 4.7 is not likely to 
happen before next year.

Thanks,
Shawn


Re: Solr grouping performance problem

Posted by shamik <sh...@gmail.com>.
Thanks Joel, appreciate your help. Is Solr 4.6 due this year?


