You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by "Zhang, Lisheng" <Li...@BroadVision.com> on 2010/09/24 03:10:37 UTC

In lucene 2.3.2, needs to stop optimization?

Hi,

We are using lucene 2.3.2, now we need to index each document as
fast as possible, so user can almost immediately search it. 

So I am considering stop IndexWriter optimization during real time, 
then in relatively off-time like late night we may call IndexWriter optimize
method explicitly once.

What is the most efficient way to completely turn off IndexWriter merge
in lucene 2.3.2?

Thanks very much for helps, Lisheng

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: In lucene 2.3.2, needs to stop optimization?

Posted by "Zhang, Lisheng" <Li...@BroadVision.com>.
Hi,

Thanks very much, I definitely plan to upgrade lucene.

I did not keep IndexWriter open partly because in our app we
have more than 3K independent lucene directories, so it would
be hard to put them all into memory, but I may cache some
busiest ones.

Best regards, Lisheng

-----Original Message-----
From: Danil TORIN [mailto:torindan@gmail.com]
Sent: Friday, September 24, 2010 12:01 AM
To: java-user@lucene.apache.org
Subject: Re: In lucene 2.3.2, needs to stop optimization?


Is it possible for you to migrate to 2.9.x ? Or even 3.x ?
There are some huge optimization in 2.9 on reopening indexes that
significantly improve search speed.

I'm not sure..but I think indexWriter.getReader() for almost realtime
was added to 2.9, so you can keep your writer always open and get very
cheaply a new reader on each search request.

On Fri, Sep 24, 2010 at 09:47, Zhang, Lisheng
<Li...@broadvision.com> wrote:
> Hi,
>
> I read document/code and did some experiments, one possibility
> is to raise mergeFactor to high value, say close to 2Billion,
> then a lot of small files are created and after >500 docs are
> created separately, search speed dropped sharply.
>
> I noticed with our current data, if I add one doc then call
> optimize(), it took about 7s, this is too slow for real time
> search.
>
> If I keep mergeFactor as 10 and donot call optimize(), does it
> mean from time to time IndexWriter would optimize on background,
> when it happens, it may take a few seconds (so Index will delay
> a few seconds)?
>
> Should I use high mergeFactor and optimize once a day, or use
> default mergeFactor and donot call optimize? maybe latter is
> better, but I am concerned about occasional slowness?
>
> Currently I donot plan to keep IndexWriter constantly open, but
> open/close for each index request.
>
> Any suggestion to improve would be appreciated,
>
> Lisheng
>
> -----Original Message-----
> From: Zhang, Lisheng [mailto:Lisheng.Zhang@BroadVision.com]
> Sent: Thursday, September 23, 2010 6:11 PM
> To: java-user@lucene.apache.org
> Subject: In lucene 2.3.2, needs to stop optimization?
>
>
> Hi,
>
> We are using lucene 2.3.2, now we need to index each document as
> fast as possible, so user can almost immediately search it.
>
> So I am considering stop IndexWriter optimization during real time,
> then in relatively off-time like late night we may call IndexWriter optimize
> method explicitly once.
>
> What is the most efficient way to completely turn off IndexWriter merge
> in lucene 2.3.2?
>
> Thanks very much for helps, Lisheng
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: In lucene 2.3.2, needs to stop optimization?

Posted by Danil Ε’ORIN <to...@gmail.com>.
Is it possible for you to migrate to 2.9.x ? Or even 3.x ?
There are some huge optimization in 2.9 on reopening indexes that
significantly improve search speed.

I'm not sure..but I think indexWriter.getReader() for almost realtime
was added to 2.9, so you can keep your writer always open and get very
cheaply a new reader on each search request.

On Fri, Sep 24, 2010 at 09:47, Zhang, Lisheng
<Li...@broadvision.com> wrote:
> Hi,
>
> I read document/code and did some experiments, one possibility
> is to raise mergeFactor to high value, say close to 2Billion,
> then a lot of small files are created and after >500 docs are
> created separately, search speed dropped sharply.
>
> I noticed with our current data, if I add one doc then call
> optimize(), it took about 7s, this is too slow for real time
> search.
>
> If I keep mergeFactor as 10 and donot call optimize(), does it
> mean from time to time IndexWriter would optimize on background,
> when it happens, it may take a few seconds (so Index will delay
> a few seconds)?
>
> Should I use high mergeFactor and optimize once a day, or use
> default mergeFactor and donot call optimize? maybe latter is
> better, but I am concerned about occasional slowness?
>
> Currently I donot plan to keep IndexWriter constantly open, but
> open/close for each index request.
>
> Any suggestion to improve would be appreciated,
>
> Lisheng
>
> -----Original Message-----
> From: Zhang, Lisheng [mailto:Lisheng.Zhang@BroadVision.com]
> Sent: Thursday, September 23, 2010 6:11 PM
> To: java-user@lucene.apache.org
> Subject: In lucene 2.3.2, needs to stop optimization?
>
>
> Hi,
>
> We are using lucene 2.3.2, now we need to index each document as
> fast as possible, so user can almost immediately search it.
>
> So I am considering stop IndexWriter optimization during real time,
> then in relatively off-time like late night we may call IndexWriter optimize
> method explicitly once.
>
> What is the most efficient way to completely turn off IndexWriter merge
> in lucene 2.3.2?
>
> Thanks very much for helps, Lisheng
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: In lucene 2.3.2, needs to stop optimization?

Posted by "Zhang, Lisheng" <Li...@BroadVision.com>.
Hi,

I read document/code and did some experiments, one possibility
is to raise mergeFactor to high value, say close to 2Billion,
then a lot of small files are created and after >500 docs are
created separately, search speed dropped sharply.

I noticed with our current data, if I add one doc then call 
optimize(), it took about 7s, this is too slow for real time
search. 

If I keep mergeFactor as 10 and donot call optimize(), does it
mean from time to time IndexWriter would optimize on background,
when it happens, it may take a few seconds (so Index will delay
a few seconds)?

Should I use high mergeFactor and optimize once a day, or use
default mergeFactor and donot call optimize? maybe latter is
better, but I am concerned about occasional slowness?

Currently I donot plan to keep IndexWriter constantly open, but
open/close for each index request.

Any suggestion to improve would be appreciated,

Lisheng 

-----Original Message-----
From: Zhang, Lisheng [mailto:Lisheng.Zhang@BroadVision.com]
Sent: Thursday, September 23, 2010 6:11 PM
To: java-user@lucene.apache.org
Subject: In lucene 2.3.2, needs to stop optimization?


Hi,

We are using lucene 2.3.2, now we need to index each document as
fast as possible, so user can almost immediately search it. 

So I am considering stop IndexWriter optimization during real time, 
then in relatively off-time like late night we may call IndexWriter optimize
method explicitly once.

What is the most efficient way to completely turn off IndexWriter merge
in lucene 2.3.2?

Thanks very much for helps, Lisheng

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org