Posted to general@lucene.apache.org by 王建新 <li...@gmail.com> on 2008/07/24 12:00:28 UTC

How to use lucene for high search performance ?

Hi,
    If I use Lucene to execute many search requests at once, I/O becomes the performance bottleneck.
    So I use RAMDirectory to avoid I/O.
    But I found that RAMDirectory does not improve performance much when the index is big (about 1.2 GB).
    Could anyone give me advice on improving performance for concurrent searches?
    Thanks.
    
roy

Re: How to use lucene for high search performance ?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Yes you can, and that should be fast.

Another thing to try is an SSD -- look at the "Lucene performance  
issues" thread on java-user.

Mike
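
For readers following along, the idea discussed here (keep the whole index in RAM and run searches concurrently from a thread pool) can be sketched roughly as below. This is only an illustration under stated assumptions: ToyIndex, ConcurrentSearchSketch and searchAll are hypothetical names invented for this sketch and are not part of Lucene's API. The point it shows is that a read-only, in-memory structure can be shared by all worker threads without locking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Lucene code: one shared in-memory index, many searching threads.
final class ConcurrentSearchSketch {

    // Toy stand-in for an index held fully in RAM: term -> matching document ids.
    static final class ToyIndex {
        private final Map<String, List<Integer>> postings;

        ToyIndex(Map<String, List<Integer>> source) {
            // Copy once, then never mutate, so concurrent reads need no locks.
            Map<String, List<Integer>> copy = new HashMap<>();
            for (Map.Entry<String, List<Integer>> e : source.entrySet()) {
                copy.put(e.getKey(), Collections.unmodifiableList(new ArrayList<>(e.getValue())));
            }
            this.postings = Collections.unmodifiableMap(copy);
        }

        List<Integer> search(String term) {
            return postings.getOrDefault(term, Collections.<Integer>emptyList());
        }
    }

    // Runs every query on a fixed-size pool; all tasks share the same index instance.
    static List<List<Integer>> searchAll(ToyIndex index, List<String> queries, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (String query : queries) {
                futures.add(pool.submit(() -> index.search(query)));
            }
            List<List<Integer>> results = new ArrayList<>();
            for (Future<List<Integer>> future : futures) {
                results.add(future.get()); // results come back in query order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = new HashMap<>();
        postings.put("lucene", java.util.Arrays.asList(1, 3, 7));
        postings.put("search", java.util.Arrays.asList(3, 9));
        ToyIndex index = new ToyIndex(postings);
        System.out.println(searchAll(index, java.util.Arrays.asList("lucene", "search"), 4));
    }
}
```

In a real setup the shared object would be a single searcher opened over the in-memory index; whether that gives the hoped-for speedup still depends on heap size and garbage-collection behaviour on the 64-bit VM.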

On Jul 27, 2008, at 11:54 PM, 王建新 wrote:

> Thanks a lot.
>
> I have an idea: can I use Lucene on a 64-bit VM?
> In that case, I can load all the index files into RAM. Then, with no
> I/O, I can run concurrent searches in a thread pool.
>
> Will the performance be better?


Re: How to use lucene for high search performance ?

Posted by 王建新 <li...@gmail.com>.
Thanks a lot.

I have an idea: can I use Lucene on a 64-bit VM?
In that case, I can load all the index files into RAM. Then, with no I/O, I can run concurrent searches in a thread pool.

Will the performance be better?



----- Original Message ----- 
From: "Michael McCandless" <lu...@mikemccandless.com>
To: <ge...@lucene.apache.org>
Cc: <ja...@lucene.apache.org>
Sent: Saturday, July 26, 2008 4:59 AM
Subject: Re: How to use lucene for high search performance ?




Re: How to use lucene for high search performance ?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Let's move this thread to java-user (CC'd).

王建新 wrote:

> Thank you.
>
> If the index files are very big (10 GB), I cannot load them into RAM in
> one process.

Ahh OK.

> Should I use MultiSearcher to load the index files with several processes?
> How would it perform?

MultiSearcher alone doesn't really scale up -- it just lets you  
combine the results of many Searchables.

Maybe you mean ParallelMultiSearcher?  That class uses a separate
thread to search each Searchable, so if you are on a multi-core/CPU
machine that should give a net reduction in the latency of each search
(though I don't have any experience here!).

> By the way, I think only the .frq and .tis files need to be loaded into
> RAM, which would save some RAM.

You mean you don't use any positions information?  Really the OS
should do the right thing for you -- it should only cache into its I/O
cache those files that you actually use, after which searches should be
fast.

Mike
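
As a rough illustration of the approach described above (one thread per sub-index, then combining the results), here is a minimal sketch. It does not use Lucene's classes: Shard, Hit and ParallelShardSearchSketch are hypothetical names made up for this example, and the merge step (sort all hits by score, keep the top N) is an assumption about how results would be combined, not a description of ParallelMultiSearcher's internals.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Lucene code: search each sub-index on its own thread, then merge.
final class ParallelShardSearchSketch {

    // A scored result; docId and score are illustrative fields.
    static final class Hit {
        final String docId;
        final double score;

        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Stand-in for one searchable sub-index.
    interface Shard {
        List<Hit> search(String query, int topN);
    }

    static List<Hit> search(List<Shard> shards, String query, int topN) {
        // One worker per shard, so all shards are searched at the same time.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (Shard shard : shards) {
                futures.add(pool.submit(() -> shard.search(query, topN)));
            }
            // Collect every shard's hits, then keep the overall best topN by score.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> future : futures) {
                merged.addAll(future.get());
            }
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The latency of one query is then bounded by the slowest shard plus the merge, which is why this helps per-query latency on a multi-core machine but does not by itself raise total throughput when all cores are already busy.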



Re: How to use lucene for high search performance ?

Posted by 王建新 <li...@gmail.com>.
Thank you.

If the index files are very big (10 GB), I cannot load them into RAM in one process.

Should I use MultiSearcher to load the index files with several processes?
How would it perform?

By the way, I think only the .frq and .tis files need to be loaded into RAM, which would save some RAM.

roy

----- Original Message ----- 
From: "Michael McCandless" <lu...@mikemccandless.com>
To: <ge...@lucene.apache.org>
Sent: Thursday, July 24, 2008 6:09 PM
Subject: Re: How to use lucene for high search performance ?




Re: How to use lucene for high search performance ?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Try InstantiatedIndexWriter/Reader (under contrib/instantiated)?

It consumes more RAM than the RAMDirectory approach, but is faster.

Mike

PS -- this sort of question should go to java-user in the future.
