Posted to user@lucenenet.apache.org by Trevor Watson <tw...@datassimilate.com> on 2011/01/24 19:58:20 UTC

Optimization times

Just a quick question regarding using the Optimize function in Lucene.NET.

Is it more time efficient to call Optimize occasionally while adding 
documents to an index, or is it better to call it at the end of adding 
documents only?

The index we are creating has a possible 2-3 million records added at a 
time and we currently optimize every 100,000.

Thanks in advance.

Trevor Watson

RE: Optimization times

Posted by Jean-Francois Beaulac <je...@hotmail.com>.
From the Lucene FAQ:

http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_index_optimization_and_when_should_I_use_it.3F
jf


Re: Optimization times

Posted by Trevor Watson <tw...@datassimilate.com>.
Thanks to all for the speedy and useful replies!

On 01/24/2011 2:20 PM, Kevin Miller wrote:
> I battled with How To Optimize Lucene a bit. Make sure you understand why
> you are calling Optimize. You only need to optimize to improve search
> performance (by limiting the number of index files which get traversed when
> collecting hits). The index writer will "optimize" automatically based on
> your MergeFactor settings. IMHO your in-flight optimizations are likely
> unnecessary. I also found that it was best to use the Optimize(int) overload,
> which lets you select the minimum number of index files (sorry I forgot the
> correct term). That sped up optimization quite a lot, as getting a large
> index into a single index file can be quite time-consuming.
>
> I found this post very educational regarding this subject:
> http://tim.oreilly.com/pub/a/onjava/2003/03/05/lucene.html?page=1


Re: Optimization times

Posted by Kevin Miller <sc...@gmail.com>.
I battled with How To Optimize Lucene a bit. Make sure you understand why
you are calling Optimize. You only need to optimize to improve search
performance (by limiting the number of index files which get traversed when
collecting hits). The index writer will "optimize" automatically based on
your MergeFactor settings. IMHO your in-flight optimizations are likely
unnecessary. I also found that it was best to use the Optimize(int) overload,
which lets you select the minimum number of index files (sorry I forgot the
correct term). That sped up optimization quite a lot, as getting a large
index into a single index file can be quite time-consuming.
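
To put rough numbers on why the in-flight optimizations are likely
unnecessary, here is a toy cost model in Python. It is only a sketch and does
not call Lucene.NET. It assumes that every Optimize call rewrites all
documents currently in the index into one segment, and it ignores the merges
the index writer performs on its own. The function name docs_rewritten and the
figure of 2,500,000 documents (the midpoint of the 2-3 million mentioned in
the question) are illustrative choices, not anything from the library.

```python
def docs_rewritten(total_docs, optimize_every=None):
    """Count document copies caused by full optimizes in a toy model.

    Assumption: each optimize rewrites every document currently in the
    index into a single segment. The writer's own merge-factor-driven
    merges are ignored.
    """
    rewritten = 0
    if optimize_every:
        # One full rewrite each time the index grows by `optimize_every` docs.
        for indexed in range(optimize_every, total_docs + 1, optimize_every):
            rewritten += indexed
        # If the last batch is partial, a closing optimize is still needed.
        if total_docs % optimize_every:
            rewritten += total_docs
    else:
        # A single optimize after all documents have been added.
        rewritten = total_docs
    return rewritten


periodic = docs_rewritten(2_500_000, optimize_every=100_000)
final_only = docs_rewritten(2_500_000)
print(periodic, final_only, periodic / final_only)
```

Under these assumptions the script prints 32500000 2500000 13.0, i.e.
optimizing every 100,000 documents copies about 13 times as many documents as
a single optimize at the end. Real timings depend on document size, disk
speed and merge settings, so treat the ratio as an order-of-magnitude hint
only.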

I found this post very educational regarding this subject:
http://tim.oreilly.com/pub/a/onjava/2003/03/05/lucene.html?page=1


RE: Optimization times

Posted by Frank Yu <fr...@farpoint.com>.
Karnav,

It would be better for you to open a new thread for your own question. I am
not sure if I got what you meant by partially updating the index files. 

Thanks,

Frank

-----Original Message-----
From: K a r n a v [mailto:karunakerreddyv@gmail.com] 
Sent: Monday, January 24, 2011 9:41 PM
To: lucene-net-user@lucene.apache.org
Subject: Re: Optimization times

How can I partially update the index files?
I mean, I need partial indexing logic.
Could anyone please help me?

-- 
*Thanks & Regards*,
*Karunaker Reddy V*



Re: Optimization times

Posted by K a r n a v <ka...@gmail.com>.
How can I partially update the index files?
I mean, I need partial indexing logic.
Could anyone please help me?

-- 
*Thanks & Regards*,
*Karunaker Reddy V*

RE: Optimization times

Posted by Digy <di...@gmail.com>.
With 2.9.2 you don't have to optimize at all. 
DIGY
