Posted to user@lucenenet.apache.org by "Luis Fco. Ramriez Daza Glez" <lu...@yahoo.com.mx> on 2010/03/03 05:05:15 UTC

Commit / Optimize taking too long.

Hi all

We have a problem in some of our programs: Commit and/or Close takes too long.

We index a database interactively, so the index is being updated constantly. The updates are done over the network, not on the PC where the index folder is physically located.

Right after the program that performs the updates is loaded, a Commit takes less than a second. But after a few updates (about 10 to 15) the Commits start to take very long, up to 10 minutes for just one added document.

I think it could be because of the merges; that would explain why it starts taking long after 10 or 15 updates.

It is very strange, because indexing our whole DB takes less than 5 minutes, and it has approx. 500,000 records.

 

We also search the index while the updates are being done, so there are always readers/Searchers open while the Writer is committing. Could this be the cause of the Commit taking that long?

If the merges are the cause of the problem, should I increase the MergeFactor (currently the default of 10) or decrease it, so that there are fewer merges?

Our index could be optimized every day, so any merges that are not done while updating the index can be done at the end of the day.

 

Another question. I set IndexWriter.SetInfoStream(), but its output has no timestamps, so it is very difficult to debug. Is there a way to force Lucene to add timestamps to the InfoStream?

 

Thanks for any help

Best regards
Luis

RE: Commit / Optimize taking too long.

Posted by Digy <di...@gmail.com>.
> we cannot afford to close the searchers/readers and open them at the moment of the search
No, what I meant was to use "IndexWriter.GetReader()" to get a new IndexReader.
Since you will probably get a new IndexReader (and close the old one) whenever "indexReader.IsCurrent()==false", in order to see the newly added docs, the indexWriter can commit the changes.
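
The reopen-when-stale pattern described here can be sketched roughly as follows. This is only an illustration in Java (Lucene.Net mirrors the Java Lucene API), not code from this thread: SnapshotReader, ReaderSource and ReaderHolder are hypothetical names, standing in for an IndexReader with its IsCurrent() check and for whatever call hands out a fresh reader (for example IndexWriter.GetReader()). It assumes a single holder per client and ignores searches that may still be running on the old reader.

```java
/** Hypothetical stand-in for an index reader; not a Lucene type. */
interface SnapshotReader {
    boolean isCurrent(); // false once the index has changed since this reader was opened
    void close();
}

/** Hypothetical factory that hands out a fresh reader on demand. */
interface ReaderSource {
    SnapshotReader open();
}

/** Keeps one warm reader around and swaps it only when it has gone stale. */
class ReaderHolder {
    private final ReaderSource source;
    private SnapshotReader current;

    ReaderHolder(ReaderSource source) {
        this.source = source;
        this.current = source.open();
    }

    /** Returns a reader that reflects the latest changes, reopening if needed. */
    synchronized SnapshotReader acquire() {
        if (!current.isCurrent()) {
            SnapshotReader stale = current;
            current = source.open(); // open the replacement first, so a reader is always available
            stale.close();           // then release the old one so it no longer pins old files
        }
        return current;
    }
}
```

The searcher stays warm between updates, because acquire() returns the same reader for as long as the index has not changed; a new reader is opened only after an update.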


DIGY




RE: Commit / Optimize taking too long.

Posted by Luis Fco Ramirez Daza Glez <lu...@gmail.com>.
The problem is that we have several users on the network opening the index for search at the same time. Most likely there is always at least one IndexReader open on some machine.
All other readers are closed, but we still need to keep the Searcher and its Reader (Searcher.reader) open all the time. We even pre-warm the search on each client to give the user the fastest possible search experience: even when the user is not currently searching, the Searcher is open and ready to be used as soon as the user wants it. So we cannot afford to close the searchers/readers and reopen them at the moment of the search.
(Updates are done from only a single machine at a time.)

Thanks Digy

Best regards
Luis






RE: Commit / Optimize taking too long.

Posted by Digy <di...@gmail.com>.
As you said, it seems like an open IndexReader is blocking the IndexWriter. Can you try opening the IndexReader as read-only or, better, use IndexWriter.GetReader?

DIGY.




RE: Commit / Optimize taking too long.

Posted by Digy <di...@gmail.com>.
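> Also I don’t think there is a way to filter what is written to the infostream right? Something like SourceLevels
No. :-(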
DIGY





RE: Commit / Optimize taking too long.

Posted by "Luis Fco. Ramriez Daza Glez" <lu...@yahoo.com.mx>.
Hi Digy

Thanks, that’s what I had to do.
Also, I don’t think there is a way to filter what is written to the InfoStream, right? Something like SourceLevels.

Best regards
Luis





RE: Commit / Optimize taking too long.

Posted by Digy <di...@gmail.com>.
> Another question. I set IndexWriter.SetInfoStream(), but its output has no timestamps, so it is very difficult to debug. Is there a way to force Lucene to add timestamps to the InfoStream?

You can write a class that inherits from StreamWriter, override the necessary WriteLine methods, and pass an instance to SetInfoStream.
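
A minimal sketch of that idea, written in Java because Java Lucene's setInfoStream takes a PrintStream; in Lucene.Net the analogous step would be to override WriteLine on a StreamWriter subclass. TimestampedPrintStream is a hypothetical helper, not part of Lucene, and the sketch assumes the writer emits its diagnostics one line at a time through println(String).

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/**
 * Hypothetical helper (not part of Lucene): a PrintStream that prefixes
 * every line written through println(String) with a wall-clock timestamp.
 */
class TimestampedPrintStream extends PrintStream {
    private final SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);

    TimestampedPrintStream(OutputStream out) {
        super(out, true); // autoflush, so lines show up while a long commit is still running
    }

    @Override
    public void println(String line) {
        String stamp;
        synchronized (format) { // SimpleDateFormat is not thread-safe
            stamp = format.format(new Date());
        }
        super.println(stamp + " " + line);
    }
}
```

An instance wrapping a file or console stream would then be handed to the writer in place of the plain stream. Only println(String) is intercepted here; output written through other print methods passes through without a timestamp.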

DIGY
