Posted to solr-user@lucene.apache.org by Peter Wolanin <pe...@acquia.com> on 2010/01/03 21:37:01 UTC

Re: SOLR Performance Tuning: Pagination

At the NOVA Apache Lucene/Solr Meetup last May, one of the speakers
from Near Infinity (Aaron McCurry, I think) mentioned that he had a
patch for Lucene that enabled memory-efficient paging to unlimited depth.
Is anyone in contact with him?

-Peter

On Thu, Dec 24, 2009 at 11:27 AM, Grant Ingersoll <gs...@apache.org> wrote:
>
> On Dec 24, 2009, at 11:09 AM, Fuad Efendi wrote:
>
>> I used pagination for a while until I found this...
>>
>>
>> I have a filtered query ID:[* TO *] returning 20 million results (no
>> faceting), and pagination always seemed to be fast. However, it is fast only
>> with low values such as start=12345. Queries like start=28838540 take 40-60
>> seconds, and even cause an OutOfMemoryError.
>
> Yeah, deep pagination in Lucene/Solr can be problematic due to the Priority Queue management.  See http://issues.apache.org/jira/browse/LUCENE-2127 and the linked discussion on java-dev.
>
>>
>> I use highlighting and faceting on a non-tokenized "Country" field, with the standard handler.
>>
>>
>> It even seems to be a bug...
>>
>>
>> Fuad Efendi
>> +1 416-993-2060
>> http://www.linkedin.com/in/liferay
>>
>> Tokenizer Inc.
>> http://www.tokenizer.ca/
>> Data Mining, Vertical Search
>>
>>
>>
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
>
>



-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist, Acquia, Inc.
peter.wolanin@acquia.com
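
To make the priority-queue point above concrete, here is a minimal sketch in Python. It is not Aaron McCurry's patch and not the code attached to LUCENE-2127 (neither is shown in this thread); it only illustrates the general idea. The names Hit, page_by_offset and page_after are hypothetical, and the ranking rule (higher score first, ties broken by lower document id) is an assumption made for the example. Offset paging has to keep the best start + rows hits in memory, so a request like start=28838540 needs a queue of tens of millions of entries. Paging from the last hit of the previous page only ever keeps rows entries.

```python
import heapq
from typing import Iterable, List, Optional, Tuple

# Hypothetical hit type for this sketch: (score, doc_id).
Hit = Tuple[float, int]


def _rank_key(hit: Hit) -> Tuple[float, int]:
    """Smaller key means better rank: higher score first, then lower doc_id."""
    score, doc_id = hit
    return (-score, doc_id)


def page_by_offset(hits: Iterable[Hit], start: int, rows: int) -> List[Hit]:
    """Offset paging: the best start + rows hits must all be retained."""
    best = heapq.nsmallest(start + rows, hits, key=_rank_key)
    return best[start:]


def page_after(hits: Iterable[Hit], after: Optional[Hit], rows: int) -> List[Hit]:
    """Cursor paging: retain only `rows` hits ranking strictly after `after`."""
    if after is not None:
        floor = _rank_key(after)
        # Skip everything already returned on earlier pages.
        hits = (h for h in hits if _rank_key(h) > floor)
    return heapq.nsmallest(rows, hits, key=_rank_key)
```

A caller would pass after=None for the first page and then the last hit of each page to fetch the next one. The trade-off is that a cursor can only move forward page by page; it cannot jump straight to an arbitrary start value, and it needs a total ordering (hence the doc_id tie-break assumed here).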

Re: SOLR Performance Tuning: Pagination

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Yes, yes, that issue.
 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Peter Wolanin <pe...@acquia.com>
> To: solr-user@lucene.apache.org
> Sent: Thu, January 7, 2010 9:27:04 PM
> Subject: Re: SOLR Performance Tuning: Pagination
> 
> Great - this issue?  https://issues.apache.org/jira/browse/LUCENE-2127
> 
> Sounds like it would be a real win for Lucene.
> 
> -Peter


Re: SOLR Performance Tuning: Pagination

Posted by Peter Wolanin <pe...@acquia.com>.
Great - this issue?  https://issues.apache.org/jira/browse/LUCENE-2127

Sounds like it would be a real win for Lucene.

-Peter

On Thu, Jan 7, 2010 at 4:12 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Peter - Aaron just commented on a recent Solr issue (reading large result sets) and mentioned his patch.
> So far he has 2 x +1 from Grant and me to stick his patch in JIRA.



-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist, Acquia, Inc.
peter.wolanin@acquia.com

Re: SOLR Performance Tuning: Pagination

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Peter - Aaron just commented on a recent Solr issue (reading large result sets) and mentioned his patch.
So far he has 2 x +1 from Grant and me to stick his patch in JIRA.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch


