Posted to java-user@lucene.apache.org by Qiurun <qi...@huawei.com> on 2011/12/20 14:45:45 UTC

Help: About performance of search with sorting.

Dear all,

I select documents that match some criteria using TopDocs search(Query query, int n). It is also easy to select the documents that match a query and sort them by a field using TopFieldDocs search(Query query, int n, Sort sort). As is well known, Lucene uses the field cache when sorting results by field values. According to Lucene in Action (second edition), "The first time the field cache is accessed for a given reader and field, the values for all documents are visited and loaded into memory as a single large array, and recorded into an internal cache keyed on the reader instance and the field name.  This process can be quite time consuming, for a large index.", "FieldCache does not clear its entries until you close your reader and remove all references to that reader from your application."
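
To make the quoted behaviour concrete, here is a minimal sketch in plain Java. It is a toy model only, not Lucene's actual FieldCache: the names FieldCacheSketch, ToyReader, getLongs, purge and loadCount are made up for illustration. It shows the first access for a given reader and field visiting every document, with later accesses reusing the same array until the reader's entries are dropped.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of a per-reader, per-field value cache (NOT Lucene's FieldCache). */
final class FieldCacheSketch {

    /** Hypothetical stand-in for an index reader: one long value per document. */
    interface ToyReader {
        int maxDoc();
        long valueOf(int docId, String field);
    }

    // Outer key: the reader instance (compared by identity); inner key: field name.
    private final Map<ToyReader, Map<String, long[]>> cache =
            new IdentityHashMap<ToyReader, Map<String, long[]>>();

    // Counts how many times a full "visit every document" load happened.
    private int loads = 0;

    /** Returns the per-document values, loading them only on first access. */
    long[] getLongs(ToyReader reader, String field) {
        Map<String, long[]> perReader = cache.get(reader);
        if (perReader == null) {
            perReader = new HashMap<String, long[]>();
            cache.put(reader, perReader);
        }
        long[] values = perReader.get(field);
        if (values == null) {
            // Expensive step: one array slot per document in the index.
            values = new long[reader.maxDoc()];
            for (int doc = 0; doc < values.length; doc++) {
                values[doc] = reader.valueOf(doc, field);
            }
            perReader.put(field, values);
            loads++;
        }
        return values;
    }

    /** Models closing the reader: every cached array for it is dropped. */
    void purge(ToyReader reader) {
        cache.remove(reader);
    }

    int loadCount() {
        return loads;
    }
}
```

If Lucene behaves as the quoted passage says, only the first sorted search on a freshly opened reader would pay the loading cost, so a timing comparison would need to look at that first search separately from later ones.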

We now have an index with about 200 million docs in it. However, we cannot find any obvious performance difference between the two approaches. I would like to know why. Thanks for your advice.

(We are using Lucene 3.2.0 and Java version 1.6.0_26 on SUSE Linux Enterprise Server 10 SP2.)

Best,
Qiu Run



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Help: About performance of search with sorting.

Posted by Erick Erickson <er...@gmail.com>.

What are you specifying for your sort criteria? And what kind of field
is it we're talking about here?

Best
Erick

On Tue, Dec 20, 2011 at 8:45 AM, Qiurun <qi...@huawei.com> wrote:
> Dear all,
>
> I select documents that match some criteria using TopDocs search(Query query, int n). It is also easy to select the documents that match a query and sort them by a field using TopFieldDocs search(Query query, int n, Sort sort). As is well known, Lucene uses the field cache when sorting results by field values. According to Lucene in Action (second edition), "The first time the field cache is accessed for a given reader and field, the values for all documents are visited and loaded into memory as a single large array, and recorded into an internal cache keyed on the reader instance and the field name.  This process can be quite time consuming, for a large index.", "FieldCache does not clear its entries until you close your reader and remove all references to that reader from your application."
>
> We now have an index with about 200 million docs in it. However, we cannot find any obvious performance difference between the two approaches. I would like to know why. Thanks for your advice.
>
> (We are using Lucene 3.2.0 and Java version 1.6.0_26 on SUSE Linux Enterprise Server 10 SP2.)
>
> Best,
> Qiu Run