Posted to java-user@lucene.apache.org by manish gupta <to...@gmail.com> on 2018/05/07 12:44:12 UTC

Query on searchAfter API usage in IndexSearcher

Hi Team,

I am new to Lucene and I am trying to use Lucene for text search in my
project to achieve better results in terms of query performance.

Initially I was facing a lot of GC issues while using Lucene, as I was
calling the search API and passing the total document count as the number
of hits to return. As my data size is around 4 billion, the number of
documents created by Lucene was huge. Internally, the search API uses
TopScoreDocCollector, which creates a PriorityQueue sized to the given
document count, thus causing a lot of GC.
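
To illustrate why the requested hit count drives memory use, here is a
minimal sketch of a top-N collector built on java.util.PriorityQueue. It is
not Lucene's TopScoreDocCollector; the names TopNSketch and topN are
hypothetical, and scores are plain floats rather than ScoreDoc objects. The
point is only that the heap never holds more than n entries, so asking for
"all documents" makes the heap as large as the result set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {
    /** Returns the n highest scores, best first, holding at most n values in the heap. */
    static List<Float> topN(float[] scores, int n) {
        if (n <= 0) {
            return new ArrayList<>(); // PriorityQueue rejects a capacity below 1
        }
        // Min-heap: the head is the weakest score currently kept.
        PriorityQueue<Float> heap = new PriorityQueue<>(n);
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s > heap.peek()) {
                heap.poll(); // evict the weakest kept score
                heap.add(s);
            }
        }
        List<Float> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.5f, 0.7f, 0.1f};
        System.out.println(topN(scores, 2)); // prints [0.9, 0.7]
    }
}
```

With a small n the heap stays small no matter how many documents match,
which is consistent with the GC pressure going away once only 10 hits are
requested per call.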

*To avoid this problem I am trying to query in a paginated way, wherein
I query only 10 documents at a time and then use the searchAfter API to
query further, passing the last ScoreDoc from the previous result. This
has resolved the GC problem, but the query time has increased by a huge
margin, from 3 sec to 600 sec.*

*When I debugged, I found that even though I use the searchAfter API, it
does not avoid the IO; every time, it reads the data from disk again. It
only skips the results returned by the previous search. Is my
understanding correct? If yes, please let me know if there is a better way
to query the results incrementally, so as to avoid GC with minimal impact
on query performance.*
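
The behaviour described above can be modelled with a small sketch. This is
not Lucene code: SearchAfterSketch, Hit, searchAfter and docsVisited are
hypothetical names, the "index" is just an array of scores, and the
ordering (higher score first, ties broken by lower doc id) is an assumption
chosen to resemble relevance ordering. Each call walks every matching
document again and merely skips hits that sort at or before the previous
page's last hit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class SearchAfterSketch {
    /** A hit: document id plus score (a stand-in for a ScoreDoc). */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Best first: higher score wins, ties broken by lower doc id.
    static final Comparator<Hit> BEST_FIRST = (a, b) -> {
        int c = Float.compare(b.score, a.score);
        return c != 0 ? c : Integer.compare(a.doc, b.doc);
    };

    static long docsVisited = 0; // total documents scored across all calls

    /** Returns the next n hits that sort strictly after 'after' (n must be >= 1). */
    static List<Hit> searchAfter(float[] scores, Hit after, int n) {
        // Heap of the n best eligible hits; its head is the worst of them.
        PriorityQueue<Hit> heap = new PriorityQueue<>(n, BEST_FIRST.reversed());
        for (int doc = 0; doc < scores.length; doc++) {
            docsVisited++; // every document is visited again on every page
            Hit h = new Hit(doc, scores[doc]);
            if (after != null && BEST_FIRST.compare(h, after) <= 0) {
                continue; // belongs to an earlier page, skip it
            }
            if (heap.size() < n) {
                heap.add(h);
            } else if (BEST_FIRST.compare(h, heap.peek()) < 0) {
                heap.poll();
                heap.add(h);
            }
        }
        List<Hit> page = new ArrayList<>(heap);
        page.sort(BEST_FIRST);
        return page;
    }

    public static void main(String[] args) {
        float[] scores = {0.3f, 0.9f, 0.5f, 0.9f, 0.1f, 0.7f};
        Hit after = null;
        int pages = 0;
        while (true) {
            List<Hit> page = searchAfter(scores, after, 2);
            if (page.isEmpty()) break;
            pages++;
            after = page.get(page.size() - 1);
        }
        System.out.println(pages + " pages, " + docsVisited + " docs visited");
        // prints "3 pages, 24 docs visited"
    }
}
```

In this model, paging through M matches with page size k costs roughly
(M / k) full passes, so the total work grows quadratically with the result
set. That would fit a jump from seconds to minutes when the whole result
set is read 10 hits at a time.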

Regards
Manish Gupta

Re: Query on searchAfter API usage in IndexSearcher

Posted by Bryan Bende <bb...@gmail.com>.
Are you specifying a sort clause on your query?

I'm not totally sure, but I think having a sort clause might be a
requirement for efficient deep paging.

I know Solr's cursorMark feature uses the searchAfter API, and a
cursorMark is essentially the sort values of the last document from
the previous result:

https://github.com/apache/lucene-solr/blob/e30264b31400a147507aabd121b1152020b8aa6d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1524-L1525
https://lucene.apache.org/solr/guide/7_3/pagination-of-results.html


On Wed, May 9, 2018 at 4:56 AM, Jacky Li <ja...@qq.com> wrote:
> I have encountered the same problem. I wonder if anyone knows the solution?
>
> Regards,
> Jacky
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>



Re: Query on searchAfter API usage in IndexSearcher

Posted by Jacky Li <ja...@qq.com>.
I have encountered the same problem. I wonder if anyone knows the solution?

Regards,
Jacky





Re: Query on searchAfter API usage in IndexSearcher

Posted by "tomanishgupta18@gmail.com" <to...@gmail.com>.
Hi Lucene Team,

Can you please reply to my query? It's an urgent issue and we need to
resolve it at the earliest.

The Lucene version used is 6.3.0, but I have also tried the latest
version, 7.3.0.

Regards
Manish Gupta


