Posted to java-user@lucene.apache.org by Michael McCandless <lu...@mikemccandless.com> on 2008/07/25 22:59:50 UTC

Re: How to use lucene for high search performance ?

Let's move this thread to java-user (CC'd).

王建新 wrote:

> Thank you.
>
> If the index files are very big (10 GB), I cannot load them into RAM
> in one process.

Ahh OK.

> Should I use MultiSearcher to load index files with several processes?
> How about its performance?

MultiSearcher alone doesn't really scale up -- it just lets you  
combine the results of many Searchables.

Maybe you mean ParallelMultiSearcher?  That class uses a separate  
thread to search each Searchable, so if you are on a multi core/cpu  
machine that should give a net reduction in latency of each search  
(though I don't have any experience here!).
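
To illustrate the idea only (one thread per sub-index, then merge the
results), here is a minimal sketch in plain Java. It does not use the
Lucene API: the Shard interface, the ParallelShardSearch class and the
use of bare scores as results are all hypothetical stand-ins, and this
is not how ParallelMultiSearcher is implemented internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class ParallelShardSearch {

    /** Hypothetical stand-in for one sub-index: returns the scores of its hits. */
    public interface Shard {
        List<Float> search(String query);
    }

    /** Searches every shard on its own thread and returns the topN scores, best first. */
    public static List<Float> search(List<Shard> shards, final String query, int topN) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            // Submit one task per shard so that all shards are searched concurrently.
            List<Future<List<Float>>> pending = new ArrayList<>();
            for (final Shard shard : shards) {
                Callable<List<Float>> task = () -> shard.search(query);
                pending.add(pool.submit(task));
            }
            // Wait for every shard, then merge and keep only the best topN scores.
            List<Float> merged = new ArrayList<>();
            for (Future<List<Float>> f : pending) {
                merged.addAll(f.get());
            }
            Collections.sort(merged, Collections.<Float>reverseOrder());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The latency of one query is then roughly that of the slowest shard plus
the merge, which is why this helps only when spare cores are available.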

> by the way, I think only .frq and .tis files need to load in ram.
> And it can save some ram.

You mean you don't use any positions information?  Really the OS  
should do the right thing for you -- it should cache into its IO  
cache only those files that you actually use, after which searches  
should be fast.

Mike

>
> roy
>
> ----- Original Message -----
> From: "Michael McCandless" <lu...@mikemccandless.com>
> To: <ge...@lucene.apache.org>
> Sent: Thursday, July 24, 2008 6:09 PM
> Subject: Re: How to use lucene for high search performance ?
>
>
>
> Try InstantiatedIndexWriter/Reader (under contrib/instantiated)?
>
> It consumes more RAM than the RAMDirectory approach, but is
> faster.
>
> Mike
>
> PS -- this sort of question should go to java-user in the future.
>
> 王建新 wrote:
>
>>
>> Hi,
>>   If I use lucene to execute many search requests at one time, the
>> io operation will be the bottleneck of the performance.
>>   So I use RAMDirectory to avoid io operation.
>>   But I found RAMDirectory cannot raise the performance much if the
>> index is big( about 1.2G).
>>   Could anyone give me any advice to raise the performance for
>> concurrent search operation?
>>   Thanks.
>>
>> roy


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to use lucene for high search performance ?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Yes you can, and that should be fast.

Another thing to try is an SSD -- look at the "Lucene performance  
issues" thread on java-user.
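
For the thread pool part, a minimal sketch in plain Java of many
queries running concurrently against one shared, read-only in-memory
index. Nothing here is Lucene API: SharedSearcherPool and InMemoryIndex
are hypothetical names, and the term-to-document-ids map is only an
assumed stand-in for a real index held in RAM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedSearcherPool {

    /** Hypothetical read-only index: term -> ids of the documents containing it. */
    public static final class InMemoryIndex {
        private final Map<String, List<Integer>> postings;

        public InMemoryIndex(Map<String, List<Integer>> postings) {
            // Copy once; the map is never modified afterwards, so concurrent reads are safe.
            this.postings = new HashMap<>(postings);
        }

        public List<Integer> search(String term) {
            List<Integer> hits = postings.get(term);
            return hits == null ? new ArrayList<Integer>() : hits;
        }
    }

    /** Runs all queries concurrently against one shared index; results keep query order. */
    public static List<List<Integer>> searchAll(final InMemoryIndex index,
                                                List<String> queries, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
        try {
            List<Callable<List<Integer>>> tasks = new ArrayList<>();
            for (final String q : queries) {
                tasks.add(() -> index.search(q));
            }
            // invokeAll blocks until every task is done and preserves task order.
            List<List<Integer>> results = new ArrayList<>();
            for (Future<List<Integer>> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the design is that all threads share a single index
instance instead of each loading its own copy, so RAM use does not grow
with the number of threads.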

Mike

On Jul 27, 2008, at 11:54 PM, 王建新 wrote:

> Thanks a lot.
>
> I have an idea: can I use Lucene on a 64-bit VM?
> In that case, I can load all index files into RAM. Then, with no IO
> operations, I can execute concurrent searches in a thread pool.
>
> Will its performance be better?
>
>
>
> ----- Original Message -----
> From: "Michael McCandless" <lu...@mikemccandless.com>
> To: <ge...@lucene.apache.org>
> Cc: <ja...@lucene.apache.org>
> Sent: Saturday, July 26, 2008 4:59 AM
> Subject: Re: How to use lucene for high search performance ?
>
>
>
> Let's move this thread to java-user (CC'd).
>
> 王建新 wrote:
>
>> Thank you.
>>
>> If the index files are very big (10 GB), I cannot load them into RAM
>> in one process.
>
> Ahh OK.
>
>> Should I use MultiSearcher to load index files with several processes?
>> How about its performance?
>
> MultiSearcher alone doesn't really scale up -- it just lets you
> combine the results of many Searchables.
>
> Maybe you mean ParallelMultiSearcher?  That class uses a separate
> thread to search each Searchable, so if you are on a multi core/cpu
> machine that should give a net reduction in latency of each search
> (though I don't have any experience here!).
>
>> by the way, I think only .frq and .tis files need to load in ram.
>> And it can save some ram.
>
> You mean you don't use any positions information?  Really the OS
> should do the right thing for you -- it should only cache into its IO
> cache those files that you actually use after which searches should be
> fast.
>
> Mike
>
>>
>> roy
>>
>> ----- Original Message -----
>> From: "Michael McCandless" <lu...@mikemccandless.com>
>> To: <ge...@lucene.apache.org>
>> Sent: Thursday, July 24, 2008 6:09 PM
>> Subject: Re: How to use lucene for high search performance ?
>>
>>
>>
>> Try InstantiatedIndexWriter/Reader (under contrib/instantiated)?
>>
>> It consumes more RAM than the RAMDirectory approach, but is
>> faster.
>>
>> Mike
>>
>> PS -- this sort of question should go to java-user in the future.
>>
>> 王建新 wrote:
>>
>>>
>>> Hi,
>>>  If I use lucene to execute many search requests at one time, the
>>> io operation will be the bottleneck of the performance.
>>>  So I use RAMDirectory to avoid io operation.
>>>  But I found RAMDirectory cannot raise the performance much if the
>>> index is big( about 1.2G).
>>>  Could anyone give me any advice to raise the performance for
>>> concurrent search operation?
>>>  Thanks.
>>>
>>> roy


Re: How to use lucene for high search performance ?

Posted by 王建新 <li...@gmail.com>.
Thanks a lot.

I have an idea: can I use Lucene on a 64-bit VM?
In that case, I can load all index files into RAM. Then, with no IO operations, I can execute concurrent searches in a thread pool.

Will its performance be better?



----- Original Message ----- 
From: "Michael McCandless" <lu...@mikemccandless.com>
To: <ge...@lucene.apache.org>
Cc: <ja...@lucene.apache.org>
Sent: Saturday, July 26, 2008 4:59 AM
Subject: Re: How to use lucene for high search performance ?



Let's move this thread to java-user (CC'd).

王建新 wrote:

> Thank you.
>
> If the index files are very big (10 GB), I cannot load them into RAM
> in one process.

Ahh OK.

> Should I use MultiSearcher to load index files with several processes?
> How about its performance?

MultiSearcher alone doesn't really scale up -- it just lets you  
combine the results of many Searchables.

Maybe you mean ParallelMultiSearcher?  That class uses a separate  
thread to search each Searchable, so if you are on a multi core/cpu  
machine that should give a net reduction in latency of each search  
(though I don't have any experience here!).

> by the way, I think only .frq and .tis files need to load in ram.
> And it can save some ram.

You mean you don't use any positions information?  Really the OS  
should do the right thing for you -- it should only cache into its IO  
cache those files that you actually use after which searches should be  
fast.

Mike

>
> roy
>
> ----- Original Message -----
> From: "Michael McCandless" <lu...@mikemccandless.com>
> To: <ge...@lucene.apache.org>
> Sent: Thursday, July 24, 2008 6:09 PM
> Subject: Re: How to use lucene for high search performance ?
>
>
>
> Try InstantiatedIndexWriter/Reader (under contrib/instantiated)?
>
> It consumes more RAM than the RAMDirectory approach, but is
> faster.
>
> Mike
>
> PS -- this sort of question should go to java-user in the future.
>
> 王建新 wrote:
>
>>
>> Hi,
>>   If I use lucene to execute many search requests at one time, the
>> io operation will be the bottleneck of the performance.
>>   So I use RAMDirectory to avoid io operation.
>>   But I found RAMDirectory cannot raise the performance much if the
>> index is big( about 1.2G).
>>   Could anyone give me any advice to raise the performance for
>> concurrent search operation?
>>   Thanks.
>>
>> roy